Description

Insect-Resistant Transgenic Plants and Methods for
Improving δ-Endotoxin Activity Against Target Insects


1.0 Background of the Invention 

1.1 Field of the Invention 

This invention relates to methods for producing genetically-engineered,
recombinant δ-endotoxins derived from Bacillus thuringiensis that are useful
in the control of southern corn rootworm (Diabrotica undecimpunctata howardi
Barber) and western corn rootworm (Diabrotica virgifera virgifera LeConte).

1.2 Description of the Related Art 

Almost all field crops, plants, and commercial farming areas are susceptible
to attack by one or more insect pests. Particularly problematic are
coleopteran and lepidopteran pests. For example, vegetable and cole crops
such as artichokes, kohlrabi, arugula, leeks, asparagus, lentils, beans,
lettuce (e.g., head, leaf, romaine), beets, bok choy, malanga, broccoli,
melons (e.g., muskmelon, watermelon, Crenshaw, honeydew, cantaloupe),
brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower, okra, onions,
celery, parsley, chick peas, parsnips, chicory, peas, Chinese cabbage,
peppers, collards, potatoes, cucumber, pumpkins, cucurbits, radishes, dry
bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive,
soybean, garlic, spinach, green onions, squash, greens, sugar beets, sweet
potatoes, turnip, swiss chard, horseradish, tomatoes, kale, turnips, and a
variety of spices are sensitive to infestation by one or more of the
following insect pests: alfalfa looper, armyworm, beet armyworm, artichoke
plume moth, cabbage budworm, cabbage looper, cabbage webworm, corn earworm,
celery leafeater, cross-striped cabbageworm, European corn borer,
diamondback moth, green cloverworm, imported cabbageworm, melonworm,
omnivorous leafroller, pickleworm, rindworm complex, saltmarsh caterpillar,
soybean looper, tobacco budworm, tomato fruitworm, tomato hornworm, tomato
pinworm, velvetbean caterpillar, and yellowstriped armyworm.

Likewise, pasture and hay crops such as alfalfa, pasture grasses and silage
are often attacked by such pests as armyworm, beet armyworm, alfalfa
caterpillar, European skipper, a variety of loopers and webworms, as well as
yellowstriped armyworms.

Fruit and vine crops such as apples, apricots, cherries, nectarines, peaches,
pears, plums, prunes, quince, almonds, chestnuts, filberts, pecans,
pistachios, walnuts, citrus, blackberries, blueberries, boysenberries,
cranberries, currants, loganberries, raspberries, strawberries, grapes,
avocados, bananas, kiwi, persimmons, pomegranate, pineapple, and tropical
fruits are often susceptible to attack and defoliation by achema sphinx moth,
amorbia, armyworm, citrus cutworm, banana skipper, blackheaded fireworm,
blueberry leafroller, cankerworm, cherry fruitworm, citrus cutworm, cranberry
girdler, eastern tent caterpillar, fall webworm, filbert leafroller, filbert
webworm, fruit tree leafroller, grape berry moth, grape leaffolder, grapeleaf
skeletonizer, green fruitworm, gummosos-batrachedra commosae, gypsy moth,
hickory shuckworm, hornworms, loopers, navel orangeworm, obliquebanded
leafroller, omnivorous leafroller, omnivorous looper, orange tortrix,
orangedog, oriental fruit moth, pandemis leafroller, peach twig borer, pecan
nut casebearer, redbanded leafroller, redhumped caterpillar, roughskinned
cutworm, saltmarsh caterpillar, spanworm, tent caterpillar, thecla-thecla
basillides, tobacco budworm, tortrix moth, tufted apple budmoth, variegated
leafroller, walnut caterpillar, western tent caterpillar, and yellowstriped
armyworm.

Field crops such as canola/rape seed, evening primrose, meadow foam, corn
(field, sweet, popcorn), cotton, hops, jojoba, peanuts, rice, safflower,
small grains (barley, oats, rye, wheat, etc.), sorghum, soybeans, sunflowers,
and tobacco are often targets for infestation by insects including armyworm,
Asian and other corn borers, banded sunflower moth, beet armyworm, bollworm,
cabbage looper, corn rootworm (including southern and western varieties),
cotton leaf perforator, diamondback moth, European corn borer, green
cloverworm, headmoth, headworm, imported cabbageworm, loopers (including
Anacamptodes spp.), obliquebanded leafroller, omnivorous leaftier, podworm,
saltmarsh caterpillar, southwestern corn borer, soybean looper, spotted
cutworm, sunflower moth, tobacco budworm, tobacco hornworm, and velvetbean
caterpillar.

Bedding plants, flowers, ornamentals, vegetables and container stock are
frequently fed upon by a host of insect pests such as armyworm, azalea moth,
beet armyworm, diamondback moth, ello moth (hornworm), Florida fern
caterpillar, Io moth, loopers, oleander moth, omnivorous leafroller,
omnivorous looper, and tobacco budworm.

Forests, fruit, ornamental, and nut-bearing trees, as well as shrubs and
other nursery stock are often susceptible to attack from diverse insects such
as bagworm, blackheaded budworm, browntail moth, California oakworm, Douglas
fir tussock moth, elm spanworm, fall webworm, fruittree leafroller,
greenstriped mapleworm, gypsy moth, jack pine budworm, mimosa webworm, pine
butterfly, redhumped caterpillar, saddleback caterpillar, saddle prominent
caterpillar, spring and fall cankerworm, spruce budworm, tent caterpillar,
tortrix, and western tussock moth.

Likewise, turf grasses are often attacked by pests such as armyworm, sod
webworm, and tropical sod webworm.

Because crops of commercial interest are often the target of insect attack,
environmentally-sensitive methods for controlling or eradicating insect
infestation are desirable in many instances. This is particularly true for
farmers, nurserymen, growers, and managers of commercial and residential
areas who seek to control insect populations using eco-friendly compositions.

The most widely used environmentally-sensitive insecticidal formulations
developed in recent years have been composed of microbial pesticides derived
from the bacterium Bacillus thuringiensis. B. thuringiensis is a
Gram-positive bacterium that produces crystal proteins or inclusion bodies
which are specifically toxic to certain orders and species of insects. Many
different strains of B. thuringiensis have been shown to produce insecticidal
crystal proteins. Compositions including B. thuringiensis strains which
produce insecticidal proteins have been commercially available and used as
environmentally-acceptable insecticides because they are quite toxic to the
specific target insect, but are harmless to plants and other non-targeted
organisms.

1.2.1 δ-Endotoxins

δ-Endotoxins are used to control a wide range of leaf-eating caterpillars and
beetles, as well as mosquitoes. These proteinaceous parasporal crystals, also
referred to as insecticidal crystal proteins, crystal proteins, Bt
inclusions, crystalline inclusions, inclusion bodies, and Bt toxins, are a
large collection of insecticidal proteins produced by B. thuringiensis that
are toxic upon ingestion by a susceptible insect host. Over the past decade
research on the structure and function of B. thuringiensis toxins has covered
all of the major toxin categories, and while these toxins differ in specific
structure and function, general similarities in the structure and function
are assumed. Based on the accumulated knowledge of B. thuringiensis toxins, a
generalized mode of action for B. thuringiensis toxins has been created and
includes: ingestion by the insect, solubilization in the insect midgut (a
combination stomach and small intestine), resistance to digestive enzymes
sometimes with partial digestion actually "activating" the toxin, binding to
the midgut cells, formation of a pore in the insect cells and the disruption
of cellular homeostasis (English and Slatin, 1992).

1.2.2 Genes Encoding Crystal Proteins 

Many of the δ-endotoxins are related to various degrees by similarities in
their amino acid sequences. Historically, the proteins and the genes which
encode them were classified based largely upon their spectrum of insecticidal
activity. The review by Höfte and Whiteley (1989) discusses the genes and
proteins that were identified in B. thuringiensis prior to 1990, and sets
forth the nomenclature and classification scheme which has traditionally been
applied to B. thuringiensis genes and proteins. cryI genes encode
lepidopteran-toxic CryI proteins, cryII genes encode CryII proteins that are
toxic to both lepidopterans and dipterans, cryIII genes encode
coleopteran-toxic CryIII proteins, while cryIV genes encode dipteran-toxic
CryIV proteins.

Based on the degree of sequence similarity, the proteins were further
classified into subfamilies; more highly related proteins within each family
were assigned divisional letters such as CryIA, CryIB, CryIC, etc. Even more
closely related proteins within each division were given names such as
CryIC1, CryIC2, etc.

Recently a new nomenclature was developed which systematically classified the
Cry proteins based upon amino acid sequence homology rather than upon insect
target specificities. The classification scheme for many known toxins, not
including allelic variations in individual proteins, is summarized in Table 1.
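
The revised names listed in Table 1 appear to follow a fixed pattern: a family prefix (Cry or Cyt) followed by a number, an uppercase letter, a lowercase letter, and a trailing number that distinguishes closely related entries. As a minimal sketch, assuming that reading of the table is correct, such a name can be split into its parts as shown below. The function name `parse_revised_name`, the `ToxinName` type, and its field names are illustrative assumptions and are not part of the nomenclature itself.

```python
import re
from typing import NamedTuple, Optional


class ToxinName(NamedTuple):
    """Parts of a revised-nomenclature name such as 'Cry3Bb1' (hypothetical type)."""
    family: str                 # "Cry" or "Cyt"
    primary: int                # e.g. 3 in Cry3Bb1
    secondary: str              # e.g. "B"
    tertiary: str               # e.g. "b"
    quaternary: Optional[int]   # e.g. 1; None when the trailing number is omitted


# Prefix, number, uppercase letter, lowercase letter, optional trailing number.
_NAME_PATTERN = re.compile(r"^(Cry|Cyt)(\d+)([A-Z])([a-z])(\d+)?$")


def parse_revised_name(name: str) -> ToxinName:
    """Split a revised name into its ranks; raise ValueError if it does not fit."""
    match = _NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a revised-nomenclature name: {name!r}")
    family, primary, secondary, tertiary, quaternary = match.groups()
    return ToxinName(
        family,
        int(primary),
        secondary,
        tertiary,
        int(quaternary) if quaternary is not None else None,
    )


if __name__ == "__main__":
    for example in ("Cry3Bb1", "Cry11Aa2", "Cyt2Ba5", "Cry3Bb"):
        print(example, "->", parse_revised_name(example))
```

Names written in the older style, such as CryIIIB2, are rejected by this pattern. That is intentional: the old and revised names do not map onto each other by a simple rule, so Table 1 itself remains the reference for conversions.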

Table 1

Known B. thuringiensis δ-Endotoxins, GenBank Accession Numbers,
and Revised Nomenclature^a

New         Old           GenBank Accession #
Cry1Aa1     CryIA(a)      M11250
Cry1Aa2     CryIA(a)      M10917
Cry1Aa3     CryIA(a)      D00348
Cry1Aa4     CryIA(a)
Cry1Aa5     CryIA(a)      D17518
Cry1Aa6     CryIA(a)      U43605
Cry1Ab1     CryIA(b)      M13898
Cry1Ab2     CryIA(b)      M12661
Cry1Ab3     CryIA(b)      M15271
Cry1Ab4     CryIA(b)      D00117
Cry1Ab5     CryIA(b)      X04698
Cry1Ab6     CryIA(b)      M37263
Cry1Ab7     CryIA(b)      X13233
Cry1Ab8     CryIA(b)      M16463
Cry1Ab9     CryIA(b)      X54939
Cry1Ab10    CryIA(b)      A29125
Cry1Ac1     CryIA(c)      M11068
Cry1Ac2     CryIA(c)      M35524
Cry1Ac3     CryIA(c)      X54159
Cry1Ac4     CryIA(c)      M73249

Table 1 (cont'd)

New         Old           GenBank Accession #
Cry1Ac5     CryIA(c)      M73248
Cry1Ac6     CryIA(c)      U43606
Cry1Ac7     CryIA(c)      U87793
Cry1Ac8     CryIA(c)      U87397
Cry1Ac9     CryIA(c)      U89872
Cry1Ac10    CryIA(c)      AJ002514
Cry1Ad1     CryIA(d)      M73250
Cry1Ae1     CryIA(e)      M65252
Cry1Ba1     CryIB         X06711
Cry1Ba2                   X95704
Cry1Bb1     ET5           L32020
Cry1Bc1     CryIb(c)      Z46442
Cry1Bd1     CryE1         U70726
Cry1Ca1     CryIC         X07518
Cry1Ca2     CryIC         X13620
Cry1Ca3     CryIC         M73251
Cry1Ca4     CryIC         A27642
Cry1Ca5     CryIC         X96682
Cry1Ca6     CryIC         X96683
Cry1Ca7     CryIC         X96684
Cry1Cb1     CryIC(b)      M97880
Cry1Da1     CryID         X54160
Cry1Db1     PrtB          Z22511
Cry1Ea1     CryIE         X53985
Cry1Ea2     CryIE         X56144
Cry1Ea3     CryIE         M73252
Cry1Ea4                   U94323
Cry1Eb1     CryIE(b)      M73253
Cry1Fa1     CryIF         M63897

Table 1 (cont'd)

New         Old           GenBank Accession #
Cry1Fa2     CryIF         M63897
Cry1Fb1     PrtD          Z22512
Cry1Ga1     PrtA          Z22510
Cry1Ga2     CryIM         Y09326
Cry1Gb1     CryH2         U70725
Cry1Ha1     PrtC          Z22513
Cry1Hb1                   U35780
Cry1Ia1     CryV          X62821
Cry1Ia2     CryV          M98544
Cry1Ia3     CryV          L36338
Cry1Ia4     CryV          L49391
Cry1Ia5     CryV          Y08920
Cry1Ib1     CryV          U07642
Cry1Ja1     ET4           L32019
Cry1Jb1     ET1           U31527
Cry1Ka1                   U28801
Cry2Aa1     CryIIA        M31738
Cry2Aa2     CryIIA        M23723
Cry2Aa3                   D86084
Cry2Ab1     CryIIB        M23724
Cry2Ab2     CryIIB        X55416
Cry2Ac1     CryIIC        X57252
Cry3Aa1     CryIIIA       M22472
Cry3Aa2     CryIIIA       J02978
Cry3Aa3     CryIIIA       Y00420
Cry3Aa4     CryIIIA       M30503
Cry3Aa5     CryIIIA       M37207
Cry3Aa6     CryIIIA       U10985
Cry3Ba1     CryIIIB       X17123

Table 1 (cont'd)

New         Old           GenBank Accession #
Cry3Ba2     CryIIIB       A07234
Cry3Bb1     CryIIIB2      M89794
Cry3Bb2     CryIIIC(b)    U31633
Cry3Ca1     CryIIID       X59797
Cry4Aa1     CryIVA        Y00423
Cry4Aa2     CryIVA        D00248
Cry4Ba1     CryIVB        X07423
Cry4Ba2     CryIVB        X07082
Cry4Ba3     CryIVB        M20242
Cry4Ba4     CryIVB        D00247
Cry5Aa1     CryVA(a)      L07025
Cry5Ab1     CryVA(b)      L07026
Cry5Ba1     PS86Q3        U19725
Cry6Aa1     CryVIA        L07022
Cry6Ba1     CryVIB        L07024
Cry7Aa1     CryIIIC       M64478
Cry7Ab1     CryIIICb      U04367
Cry8Aa1     CryIIIE       U04364
Cry8Ba1     CryIIIG       U04365
Cry8Ca1     CryIIIF       U04366
Cry9Aa1     CryIG         X58120
Cry9Aa2     CryIG         X58534
Cry9Ba1     CryIX         X75019
Cry9Ca1     CryIH         Z37527
Cry9Da1     N141          D85560
Cry10Aa1    CryIVC        M12662
Cry11Aa1    CryIVD        M31737
Cry11Aa2    CryIVD        M22860
Cry11Ba1    Jeg80         X86902

Table 1 (cont'd)

New         Old           GenBank Accession #
Cry12Aa1    CryVB
Cry13Aa1    CryVC         L07023
Cry14Aa1    CryVD         U13955
Cry15Aa1    34kDa         M76442
Cry16Aa1    cbm71         X94146
Cry17Aa1    cbm71         X99478
Cry18Aa1    CryBP1
Cry19Aa1    Jeg65         Y08920
Cry20Aa1                  U82518
Cry21Aa1                  I32932
Cry22Aa1                  I34547
Cyt1Aa1     CytA          X03182
Cyt1Aa2     CytA          X04338
Cyt1Aa3     CytA
Cyt1Aa4     CytA
Cyt1Ab1     CytM
Cyt1Ba1                   U37196
Cyt2Aa1     CytB          Z14147
Cyt2Ba1     "CytB"        U52043
Cyt2Ba2     "CytB"        AF020789
Cyt2Ba3     "CytB"        AF022884
Cyt2Ba4     "CytB"        AF022885
Cyt2Ba5     "CytB"        AF022886
Cyt2Bb1                   U82519



a Adapted from: 

http://epunix.biols.susx.ac.uk/Home/Neil_Crickmore/B^ 
1.2.3 Bioinsecticide Polypeptide Compositions

The utility of bacterial crystal proteins as insecticides was extended beyond
lepidopteran and dipteran larvae when the first isolation of a
coleopteran-toxic B. thuringiensis strain was reported (Krieg et al., 1983;
1984). This strain (described in U. S. Patent 4,766,203, specifically
incorporated herein by reference), designated B. thuringiensis var.
tenebrionis, is reported to be toxic to larvae of the coleopteran insects
Agelastica alni (blue alder leaf beetle) and Leptinotarsa decemlineata
(Colorado potato beetle).

U. S. Patent 5,024,837 also describes hybrid B. thuringiensis var. kurstaki
strains which showed activity against lepidopteran insects. U. S. Patent
4,797,279 (corresponding to EP 0221024) discloses a hybrid B. thuringiensis
containing a plasmid from B. thuringiensis var. kurstaki encoding a
lepidopteran-toxic crystal protein-encoding gene and a plasmid from
B. thuringiensis tenebrionis encoding a coleopteran-toxic crystal
protein-encoding gene. The hybrid B. thuringiensis strain produces crystal
proteins characteristic of those made by both B. thuringiensis kurstaki and
B. thuringiensis tenebrionis. U. S. Patent 4,910,016 (corresponding to
EP 0303379) discloses a B. thuringiensis isolate identified as
B. thuringiensis MT 104 which has insecticidal activity against coleopterans
and lepidopterans.



1.2.4 Molecular Genetic Techniques Facilitate Protein Engineering

The revolution in molecular genetics over the past decade has facilitated a
logical and orderly approach to engineering proteins with improved
properties. Site-specific and random mutagenesis methods, the advent of
polymerase chain reaction (PCR™) methodologies, and related advances in the
field have provided an extensive collection of tools for changing both the
amino acid sequences and the underlying genetic sequences of a variety of
proteins of commercial, medical, and agricultural interest.

Following the rapid increase in the number and types of crystal proteins
which have been identified in the past decade, researchers began to theorize
about using such techniques to improve the insecticidal activity of various
crystal proteins. In theory, improvements to δ-endotoxins should be possible
using the methods available to protein engineers working in the art, and it
was logical to assume that it would be possible to isolate improved variants
of the wild-type crystal proteins isolated to date. By strengthening one or
more of the aforementioned steps in the mode of action of the toxin, improved
molecules should provide enhanced activity and, therefore, represent a
breakthrough in the field. If specific amino acid residues on the protein are
identified to be responsible for a specific step in the mode of action, then
these residues can be targeted for mutagenesis to improve performance.

1.2.5 Structural Analyses of Crystal Proteins

The combination of structural analyses of B. thuringiensis toxins followed by
an investigation of the function of such structures, motifs, and the like has
taught that specific regions of crystal protein endotoxins are, in a general
way, responsible for particular functions.

Domain 1, for example, from Cry3Bb and Cry1Ac has been found to be
responsible for ion channel activity, the initial step in formation of a pore
(Walters et al., 1993; Von Tersch et al., 1994). Domains 2 and 3 have been
found to be responsible for receptor binding and insecticidal specificity
(Aronson et al., 1995; Caramori et al., 1991; Chen et al., 1993; de Maagd et
al., 1996; Ge et al., 1991; Lee et al., 1992; Lee et al., 1995; Lu et al.,
1994; Smedley and Ellar, 1996; Smith and Ellar, 1994; Rajamohan et al., 1995;
Rajamohan et al., 1996; Wu and Dean, 1996). Regions in domains 2 and 3 can
also impact the ion channel activity of some toxins (Chen et al., 1993;
Wolfersberger et al., 1996; Von Tersch et al., 1994).

1.3 Deficiencies in the Prior Art

Unfortunately, while many laboratories have attempted to make mutated crystal
proteins, few have succeeded in making mutated crystal proteins with improved
lepidopteran toxicity. In almost all of the examples of
genetically-engineered B. thuringiensis toxins in the literature, the
biological activity of the mutated crystal protein is no better than that of
the wild-type protein, and in many cases, the activity is decreased or
destroyed altogether (Almond and Dean, 1993; Aronson et al., 1995; Chen et
al., 1993; Chen et al., 1995; Ge et al., 1991; Kwak et al., 1995; Lu et al.,
1994; Rajamohan et al., 1995; Rajamohan et al., 1996; Smedley and Ellar,
1996; Smith and Ellar, 1994; Wolfersberger et al., 1996; Wu and Aronson,
1992).

For a crystal protein having approximately 650 amino acids in the sequence of
its active toxin, and the possibility of 20 different amino acids at each
position in this sequence, the likelihood of arbitrarily creating a
successful new structure is remote, even if a general function can be
assigned to a stretch of 250-300 amino acids. Indeed, the above prior art
with respect to crystal protein gene mutagenesis has been concerned primarily
with studying the structure and function of the crystal proteins, using
mutagenesis to perturb some step in the mode of action, rather than with
engineering improved toxins.
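
The scale of the search problem described above can be made concrete with a short calculation. The sketch below takes the figures given in the text (approximately 650 residues, 20 possible amino acids at each position) as its only inputs and simply counts sequences; the function names are illustrative, and nothing here models biological activity.

```python
from math import comb, log10
from typing import Dict


def variant_counts(length: int = 650, alphabet: int = 20,
                   max_substitutions: int = 3) -> Dict[int, int]:
    """Count distinct variants carrying exactly k substitutions, k = 1..max.

    Choose k positions out of `length`, then one of the (alphabet - 1)
    alternative residues at each chosen position.
    """
    return {
        k: comb(length, k) * (alphabet - 1) ** k
        for k in range(1, max_substitutions + 1)
    }


def log10_sequence_space(length: int = 650, alphabet: int = 20) -> float:
    """Base-10 logarithm of the total number of possible sequences."""
    return length * log10(alphabet)


if __name__ == "__main__":
    for k, count in variant_counts().items():
        print(f"exactly {k} substitution(s): {count:,} variants")
    print(f"all possible sequences: about 10^{log10_sequence_space():.1f}")
```

Under these assumptions there are 12,350 single-substitution variants and 76,143,925 double-substitution variants, while the full sequence space is a number with 846 digits. This is why undirected substitution is unlikely to find an improved toxin.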

Collectively, the limited successes in the art to develop synthetic toxins
with improved insecticidal activity have stifled progress in this area and
confounded the search for improved endotoxins or crystal proteins. Rather
than following simple and predictable rules, the successful engineering of an
improved crystal protein may involve different strategies, depending on the
crystal protein being improved and the insect pests being targeted. Thus, the
process is highly empirical.

Accordingly, traditional recombinant DNA technology is clearly not routine
experimentation for providing improved insecticidal crystal proteins. What
are lacking in the prior art are rational methods for producing
genetically-engineered B. thuringiensis crystal proteins that have improved
insecticidal activity and, in particular, improved toxicity towards a wide
range of lepidopteran insect pests.

2.0 Summary of the Invention 

The present invention seeks to overcome these and other drawbacks inherent in
the prior art by providing genetically-engineered modified B. thuringiensis
δ-endotoxins (Cry*), and in particular modified Cry3 δ-endotoxins (designated
Cry3* endotoxins). Also provided are nucleic acid sequences comprising one or
more genes which encode such modified proteins. Particularly preferred genes
include cry3* genes such as cry3A*, cry3B*, and cry3C* genes, particularly
cry3B* genes, and more particularly, cry3Bb* genes, that encode modified
crystal proteins having improved insecticidal activity against target pests.

Also disclosed are novel methods for constructing synthetic Cry3* proteins,
synthetically-modified nucleic acid sequences encoding such proteins, and
compositions arising therefrom. Also provided are synthetic cry3* expression
vectors and various methods of using the improved genes and vectors. In a
preferred embodiment, the invention discloses and claims Cry3B* proteins and
cry3B* genes which encode improved insecticidal polypeptides.

In preferred embodiments, channel-forming toxin design methods are disclosed
which have been used to produce a specific set of designed Cry3Bb* toxins
with improved biological activity. These improved Cry3Bb* proteins are listed
in Table 2 along with their respective amino acid changes from wild-type (WT)
Cry3Bb, the nucleotide changes present in the altered cry3Bb* gene encoding
the protein, the fold increase in bioactivity over WT Cry3Bb, the structural
site of the alteration, and the design method(s) used to create the new
toxins.

Accordingly, the present invention provides, in an overall and general sense,
mutagenized Cry3 protein-encoding genes and methods of making and using such
genes. As used herein the term "mutagenized cry3 gene(s)" means one or more
cry3 genes that have been mutagenized or altered to contain one or more
nucleotide sequences which are not present in the wild type sequences, and
which encode mutant Cry3 crystal proteins (Cry3*) showing improved
insecticidal activity. Such mutagenized cry3 genes have been referred to in
the Specification as cry3* genes. Exemplary cry3* genes include cry3A*,
cry3B*, and cry3C* genes.

Exemplary mutagenized Cry3 protein-encoding genes include cry3B genes. As
used herein the term "mutagenized cry3B gene(s)" means one or more genes that
have been mutagenized or altered to contain one or more nucleotide sequences
which are not present in the wild type sequences, and which encode mutant
Cry3B crystal proteins (Cry3B*) showing improved insecticidal activity. Such
genes have been designated cry3B* genes. Exemplary cry3B* genes include
cry3Ba* and cry3Bb* genes, which encode Cry3Ba* and Cry3Bb* proteins,
respectively.

Likewise, the present invention provides mutagenized Cry3A protein-encoding
genes and methods of making and using such genes. As used herein the term
"mutagenized cry3A gene(s)" means one or more genes that have been
mutagenized or altered to contain one or more nucleotide sequences which are
not present in the wild type sequences, and which encode mutant Cry3A crystal
proteins (Cry3A*) showing improved insecticidal activity. Such mutagenized
genes have been designated as cry3A* genes.

In similar fashion, the present invention provides mutagenized Cry3C
protein-encoding genes and methods of making and using such genes. As used
herein the term "mutagenized cry3C gene(s)" means one or more genes that have
been mutagenized or altered to contain one or more nucleotide sequences which
are not present in the wild type sequences, and which encode mutant Cry3C
crystal proteins (Cry3C*) showing improved insecticidal activity. Such
mutagenized genes have been designated as cry3C* genes.

Preferably the novel sequences comprise nucleic acid sequences in which at
least one, and preferably, more than one, and most preferably, a significant
number, of wild-type cry3 nucleotides have been replaced with one or more
nucleotides, or where one or more nucleotides have been added to or deleted
from the native nucleotide sequence for the purpose of altering, adding, or
deleting the corresponding amino acids encoded by the nucleic acid sequence
so mutagenized. The desired result, therefore, is alteration of the amino
acid sequence of the encoded crystal protein to provide toxins having
improved or altered activity and/or specificity compared to that of the
unmodified crystal protein.
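
The relationship between a nucleotide replacement and the resulting amino acid change can be illustrated with a small sketch. It assumes the standard genetic code and uses a made-up nine-nucleotide coding sequence; the sequence, the function names, and the "D2G" style of labelling a substitution are illustrative assumptions and are not taken from any cry3 gene or from the design methods of the invention.

```python
from typing import Dict, List

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter residues."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitute(dna: str, index: int, new_base: str) -> str:
    """Return a copy of `dna` with the nucleotide at 0-based `index` replaced."""
    new_base = new_base.upper()
    if len(new_base) != 1 or new_base not in BASES:
        raise ValueError("new_base must be one of T, C, A, G")
    if not 0 <= index < len(dna):
        raise IndexError("index outside the sequence")
    return dna[:index] + new_base + dna[index + 1:]


def amino_acid_changes(wild_type: str, mutant: str) -> List[str]:
    """List substitutions as e.g. 'D2G' (wild-type residue, 1-based position, new residue)."""
    pairs = zip(translate(wild_type), translate(mutant))
    return [f"{a}{pos}{b}" for pos, (a, b) in enumerate(pairs, start=1) if a != b]


if __name__ == "__main__":
    wild_type = "ATGGATCTG"                      # hypothetical sequence: Met-Asp-Leu
    missense = substitute(wild_type, 4, "G")     # GAT -> GGT
    silent = substitute(wild_type, 8, "A")       # CTG -> CTA
    print(translate(wild_type), amino_acid_changes(wild_type, missense))
    print(translate(silent), amino_acid_changes(wild_type, silent))
```

In this toy case the first replacement changes aspartate to glycine at position 2, whereas the second leaves the encoded protein unchanged. Not every nucleotide replacement alters the polypeptide, which is one reason several nucleotides may need to be changed to obtain a desired protein.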

Examples of preferred Cry3Bb*-encoding genes include cry3Bb.60,
cry3Bb.11221, cry3Bb.11222, cry3Bb.11223, cry3Bb.11224, cry3Bb.11225,
cry3Bb.11226, cry3Bb.11227, cry3Bb.11228, cry3Bb.11229, cry3Bb.11230,
cry3Bb.11231, cry3Bb.11232, cry3Bb.11233, cry3Bb.11234, cry3Bb.11235,
cry3Bb.11236, cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241,
cry3Bb.11242, cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046,
cry3Bb.11048, cry3Bb.11051, cry3Bb.11057, cry3Bb.11058, cry3Bb.11081,
cry3Bb.11082, cry3Bb.11083, cry3Bb.11084, cry3Bb.11095, and cry3Bb.11098.

In a variety of illustrative embodiments, the inventors have shown remarkable
success in generating toxins with improved insecticidal activity using these
methods. In particular, the inventors have identified unique methods of
analyzing and designing toxins having improved or enhanced insecticidal
properties both in vitro and in vivo.

In addition to modifications of Cry3Bb peptides, those having benefit of the
present teaching are now also able to make mutations in a variety of
channel-forming toxins, and particularly in crystal proteins which are
related to Cry3Bb either functionally or structurally. In fact, the inventors
contemplate that any B. thuringiensis crystal protein or peptide can be
analyzed using the methods disclosed herein and may be altered using the
methods disclosed herein to produce crystal proteins having improved
insecticidal specificity or activity. Alternatively, the inventors
contemplate that those of skill in the art having the benefit of the
teachings disclosed herein will be able to prepare not only mutated Cry3
toxins with improved activity, but also other crystal proteins including all
of those proteins identified in Table 1, herein. In particular, the inventors
contemplate the creation of Cry3* variants using one or more of the methods
disclosed herein to produce toxins with improved activity. For example, the
inventors note Cry3A, Cry3B, and Cry3C crystal proteins (which are known in
the art) may be modified using one or more of the design strategies employed
herein, to prepare synthetically-modified crystal proteins with improved
properties. Likewise, one of skill in the art will even be able to utilize
the teachings of the present disclosure to modify other channel-forming
toxins, including channel-forming toxins other than B. thuringiensis crystal
proteins, and even to modify proteins and channel toxins not yet described or
characterized.

Because the structures for insecticidal crystal proteins show a remarkable
conservation of protein tertiary structure (Grochulski et al., 1995), and
because many crystal proteins show significant amino acid sequence identity
to the Cry3Bb amino acid sequence within domain 1, including proteins of the
Cry1, Cry2, Cry3, Cry4, Cry5, Cry7, Cry8, Cry9, Cry10, Cry11, Cry12, Cry13,
Cry14, and Cry16 classes (Table 1), now in light of the inventors' surprising
discovery, for the first time, those of skill in the art having benefit of
the teachings disclosed herein will be able to broadly apply the methods of
the invention to modifying a host of crystal proteins to obtain improved
activity or altered specificity. Such methods will not be limited to the
insecticidal crystal proteins disclosed in Table 1, but may also be applied
to any other related crystal protein, including those yet to be identified.

In particular, the high degree of homology between Cry3A, Cry3B, and 
Cry3C proteins is evident in the alignment of the primary amino acid sequence of 
the three proteins (FIG. 17A, FIG. 17B, and FIG. 17C). 

As such, the disclosed methods may be now applied to preparation of modified
crystal proteins having one or more alterations introduced using one or more
of the mutational design methods as disclosed herein. The inventors further
contemplate that regions may be identified in one or more domains of a
crystal protein, or other channel-forming toxin, which may be similarly
modified through site-specific or random mutagenesis to generate toxins
having improved activity, or alternatively, altered specificity.

In certain applications, the creation of altered toxins having increased
activity against one or more insects is desired. Alternatively, it may be
desirable to utilize the methods described herein for creating and
identifying altered insecticidal crystal proteins which are active against a
wider spectrum of susceptible insects. The inventors further contemplate that
the creation of chimeric insecticidal crystal proteins comprising one or more
of these mutations may be desirable for preparing "super" toxins which have
the combined advantages of increased insecticidal activity and concomitant
broad spectrum activity.

In light of the present disclosure, the mutagenesis of one or more codons
within the sequence of a toxin may result in the generation of a host of
related insecticidal proteins having improved activity. While exemplary
mutations have been described for each of the design strategies employed in
the present invention, the inventors contemplate that mutations may also be
made in insecticidal crystal proteins, including the loop regions, helical
regions, active sites of the toxins, regions involved in protein
oligomerization, and the like, which will give rise to functional
bioinsecticidal crystal proteins. All such mutations are considered to fall
within the scope of this disclosure.

In one illustrative embodiment, mutagenized cry3Bb* genes are obtained which
encode Cry3Bb* variants that are generally based upon the wild-type Cry3Bb
sequence, but that have one or more changes incorporated into the amino acid
sequence of the protein using one or more of the design strategies described
and claimed herein.

In these and other embodiments, the mutated genes encoding the crystal
proteins may be modified so as to change about one, two, three, four, or five or
so amino acids in the primary sequence of the encoded polypeptide.
Alternatively, even more changes from the native sequence may be introduced,
such that the encoded protein may have at least about 1% or 2%, or alternatively
about 3% or about 4%, or even about 5% to about 10%, or about 10% to about 15%,
or even about 15% to about 20% or more of the codons either altered, deleted, or
otherwise modified. In certain situations, it may even be desirable to alter
substantially more of the primary amino acid sequence to obtain the desired
modified protein. In such cases the inventors contemplate that from about 25% to
about 50%, or even from about 50% to about 75%, or more of the native (or
wild-type) codons may be altered, deleted, or otherwise modified.
Alternatively, mutations may be made in the amino acid sequences or underlying
DNA gene sequences which result in the insertion or deletion of one or more
amino acids within one or more regions of the crystal protein or peptide.
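
As a rough illustration of the percentages discussed above, the fraction of
altered codons between a native and a mutated coding sequence could be tallied
as in the following Python sketch. The function name and any sequences passed to
it are hypothetical placeholders and are not taken from this disclosure. The
sketch assumes in-frame sequences of equal length, so it counts substitutions
only and does not handle insertions or deletions.

```python
def fraction_codons_altered(native: str, variant: str) -> float:
    """Return the fraction of codon positions that differ between two
    in-frame coding sequences of equal length (substitutions only)."""
    native, variant = native.upper(), variant.upper()
    if len(native) != len(variant):
        raise ValueError("sequences must have the same length")
    if len(native) == 0 or len(native) % 3 != 0:
        raise ValueError("sequences must be non-empty and in-frame")
    total = len(native) // 3
    # Compare the sequences one codon (three nucleotides) at a time.
    changed = sum(
        native[i:i + 3] != variant[i:i + 3]
        for i in range(0, len(native), 3)
    )
    return changed / total
```

For instance, two four-codon sequences that differ at two codon positions would
give a value of 0.5, i.e., 50% of the codons altered.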

To effect such changes in the primary sequence of the encoded
polypeptides, it may be desirable to mutate or delete one or more nucleotides
from the nucleic acid sequences of the genes encoding such polypeptides, or
alternatively, under certain circumstances to add one or more nucleotides into
the primary nucleic acid sequence at one or more sites in the sequence.
Frequently, several nucleotide residues may be altered to produce the desired
polypeptide. As such, the inventors contemplate that in certain embodiments it
may be desirable to alter only one, two, three, four, or five or so nucleotides
in the primary sequence. In other embodiments, in which more changes are
desired, the mutagenesis may involve changing, deleting, or inserting 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or even 20 or so nucleotide residues
in the gene sequence. In still other embodiments, one may desire to mutate,
delete, or insert 21, 22, 23, 24, 25, 26, 27, 28, 29, 30-40, 40-50, 50-60,
60-70, 70-80, 80-90, or even 90-100, 150, 200, 250, 300, 350, 400, 450, or more
nucleotides in the sequence of the gene in order to prepare a cry3* gene which
produces a Cry3* polypeptide having the desired characteristics. In fact, any
number of mutations, deletions, and/or insertions may be made in the primary
sequence of the gene, so long as the encoded protein has the improved
insecticidal activity or specificity characteristics described herein.

Changing a large number of the codons in the nucleotide sequence of an
endotoxin-encoding gene may be particularly desirable and often necessary to
achieve the desired results, particularly in the situation of "plantizing" a DNA
sequence in order to express a DNA of non-plant origin in a transformed plant
cell. Such methods are routine to those of skill in the plant genetics arts, and
frequently many residues of a primary gene sequence will be altered to
facilitate expression of the gene in the plant cell. Preferably, the changes in
the gene sequence introduce no changes in the amino acid sequence, or introduce
only conservative replacements in the amino acid sequence such that the
polypeptide produced in the plant cell from the "plantized" nucleotide sequence
is still fully functional, and has the desired qualities when expressed in the
plant cell.
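
The requirement that a "plantized" sequence introduce no changes in the amino
acid sequence can be checked by translating both sequences and comparing the
results. The Python sketch below is one minimal way to do so, assuming the
standard genetic code. The function names are hypothetical, and the sketch does
not model the codon preferences of any particular plant host, which this passage
does not specify.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_protein(original: str, recoded: str) -> bool:
    """True if a recoded gene yields the same amino acid sequence."""
    return translate(original) == translate(recoded)
```

A recoded sequence that passes this check differs from the original only by
synonymous codon changes. Conservative amino acid replacements, which the text
also permits, would need a separate, scale-dependent comparison.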

Genes and encoded proteins mutated in the manner of the invention may
also be operatively linked to other protein-encoding nucleic acid sequences, or
expressed as fusion proteins. Both N-terminal and C-terminal fusion proteins are
contemplated. Virtually any protein- or peptide-encoding DNA sequence, or
combinations thereof, may be fused to a mutated cry3* sequence in order to
encode a fusion protein. This includes DNA sequences that encode targeting
peptides, proteins for recombinant expression, proteins to which one or more
targeting peptides is attached, protein subunits, domains from one or more
crystal proteins, and the like. Such modifications to primary nucleotide
sequences to enhance, target, or optimize expression of the gene sequence in a
particular host cell, tissue, or cellular localization, are well-known to those
of skill in the art of protein engineering and molecular biology, and it will be
readily apparent to such artisans, having benefit of the teachings of this
specification, how to facilitate such changes in the nucleotide sequence to
produce the polypeptides and polynucleotides disclosed herein.

In one aspect, the invention discloses and claims host cells comprising one
or more of the modified crystal proteins disclosed herein, and in particular,
cells of B. thuringiensis strains EG11221, EG11222, EG11223, EG11224, EG11225,
EG11226, EG11227, EG11228, EG11229, EG11230, EG11231, EG11232, EG11233,
EG11234, EG11235, EG11236, EG11237, EG11238, EG11239, EG11241, EG11242,
EG11032, EG11035, EG11036, EG11046, EG11048, EG11051, EG11057, EG11058,
EG11081, EG11082, EG11083, EG11084, EG11095, and EG11098 which comprise
recombinant DNA segments encoding synthetically-modified Cry3Bb* crystal
proteins which demonstrate improved insecticidal activity.

Likewise, the invention also discloses and claims cell cultures of
B. thuringiensis EG11221, EG11222, EG11223, EG11224, EG11225, EG11226,
EG11227, EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234,
EG11235, EG11236, EG11237, EG11238, EG11239, EG11241, EG11242, EG11032,
EG11035, EG11036, EG11046, EG11048, EG11051, EG11057, EG11058, EG11081,
EG11082, EG11083, EG11084, EG11095, and EG11098.

Such cell cultures may be biologically-pure cultures consisting of a single
strain, or alternatively may be cell co-cultures consisting of one or more
strains. Such cell cultures may be cultivated under conditions in which one or
more additional B. thuringiensis or other bacterial strains are simultaneously
co-cultured with one or more of the disclosed cultures, or alternatively, one or
more of the cell cultures of the present invention may be combined with one or
more additional B. thuringiensis or other bacterial strains following the
independent culture of each. Such procedures may be useful when suspensions of
cells containing two or more different crystal proteins are desired.

The subject cultures have been deposited under conditions that assure that
access to the cultures will be available during the pendency of this patent
application to one determined by the Commissioner of Patents and Trademarks to
be entitled thereto under 37 C.F.R. §1.14 and 35 U.S.C. §122. The deposits are
available as required by foreign patent laws in countries wherein counterparts
of the subject application, or its progeny, are filed. However, it should be
understood that the availability of a deposit does not constitute a license to
practice the subject invention in derogation of patent rights granted by
governmental action.

Further, the subject culture deposits will be stored and made available to
the public in accord with the provisions of the Budapest Treaty for the Deposit
of Microorganisms, i.e., they will be stored with all the care necessary to keep
them viable and uncontaminated for a period of at least five years after the
most recent request for the furnishing of a sample of the deposit, and in any
case, for a period of at least 30 (thirty) years after the date of deposit or
for the enforceable life of any patent which may issue disclosing the cultures.
The depositor acknowledges the duty to replace the deposits should the
depository be unable to furnish a sample when requested, due to the condition of
the deposits. All restrictions on the availability to the public of the subject
culture deposits will be irrevocably removed upon the granting of a patent
disclosing them.

Cultures shown in Table 3 were deposited in the permanent collection of
the Agricultural Research Service Culture Collection, Northern Regional Research
Laboratory (NRRL) under the terms of the Budapest Treaty.

Table 3

Strains of the Present Invention Deposited Under the Terms
of the Budapest Treaty

Strain     Deposit Date   Protein         Accession Number
                                          (NRRL Number)

EG11032    5/27/97        Cry3Bb.11032    B-21744
EG11035    5/27/97        Cry3Bb.11035    B-21745
EG11036    5/27/97        Cry3Bb.11036    B-21746
EG11037    5/27/97        Cry3Bb.11037    B-21747
EG11046    5/27/97        Cry3Bb.11046    B-21748
EG11048    5/27/97        Cry3Bb.11048    B-21749
EG11051    5/27/97        Cry3Bb.11051    B-21750
EG11057    5/27/97        Cry3Bb.11057    B-21751
EG11058    5/27/97        Cry3Bb.11058    B-21752
EG11081    5/27/97        Cry3Bb.11081    B-21753
EG11082    5/27/97        Cry3Bb.11082    B-21754
EG11083    5/27/97        Cry3Bb.11083    B-21755
EG11084    5/27/97        Cry3Bb.11084    B-21756
EG11095    5/27/97        Cry3Bb.11095    B-21757
EG11204    5/27/97        Cry3Bb.11204    B-21758
EG11221    5/27/97        Cry3Bb.11221    B-21759
EG11222    5/27/97        Cry3Bb.11222    B-21760
EG11223    5/27/97        Cry3Bb.11223    B-21761
EG11224    5/27/97        Cry3Bb.11224    B-21762
EG11225    5/27/97        Cry3Bb.11225    B-21763
EG11226    5/27/97        Cry3Bb.11226    B-21764
EG11227    5/27/97        Cry3Bb.11227    B-12765
EG11228    5/27/97        Cry3Bb.11228    B-12766
EG11229    5/27/97        Cry3Bb.11229    B-21767
EG11230    5/27/97        Cry3Bb.11230    B-21768
EG11231    5/27/97        Cry3Bb.11231    B-21769
EG11232    5/27/97        Cry3Bb.11232    B-12770
EG11233    5/27/97        Cry3Bb.11233    B-21771
EG11234    5/27/97        Cry3Bb.11234    B-21772
EG11235    5/27/97        Cry3Bb.11235    B-21773
EG11236    5/27/97        Cry3Bb.11236    B-21774
EG11237    5/27/97        Cry3Bb.11237    B-21775
EG11238    5/27/97        Cry3Bb.11238    B-21776
EG11239    5/27/97        Cry3Bb.11239    B-21777
EG11241    5/27/97        Cry3Bb.11241    B-21778
EG11242    5/27/97        Cry3Bb.11242    B-21779



Also disclosed are methods of controlling or eradicating an insect
population from an environment. Such methods generally comprise contacting the
insect population to be controlled or eradicated with an insecticidally-
effective amount of a Cry3* crystal protein composition. Preferred Cry3*
compositions include Cry3A*, Cry3B*, and Cry3C* polypeptide compositions, with
Cry3B* compositions being particularly preferred. Examples of such polypeptides
include proteins selected from the group consisting of Cry3Bb-60, Cry3Bb.11221,
Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225, Cry3Bb.11226,
Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230, Cry3Bb.11231,
Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235, Cry3Bb.11236,
Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241, Cry3Bb.11242,
Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046, Cry3Bb.11048,
Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081, Cry3Bb.11082,
Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095, and Cry3Bb.11098.

In preferred embodiments, these Cry3Bb* crystal protein compositions
comprise the amino acid sequence of any of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, or SEQ ID NO:108.

2.1 Methods for Producing Modified Cry* Proteins

The modified Cry* polypeptides of the present invention are preparable by
a process which generally involves the steps of obtaining a nucleic acid
sequence encoding a Cry* polypeptide; analyzing the structure of the polypeptide
to identify particular "target" sites for mutagenesis of the underlying gene
sequence; introducing one or more mutations into the nucleic acid sequence to
produce a change in one or more amino acid residues in the encoded polypeptide
sequence; and expressing in a transformed host cell the mutagenized nucleic acid
sequence under conditions effective to obtain the modified Cry* protein encoded
by the cry* gene.

Means for obtaining the crystal structures of the polypeptides of the
invention are well-known. Exemplary high resolution crystal structure solution
sets are given in Section 9.0 of the disclosure, and include the crystal
structures of both the Cry3A and Cry3B polypeptides disclosed herein. The
information provided in Section 9.0 permits the analyses disclosed in each of
the methods herein which rely on the 3D crystal structure information for
targeting mutagenesis of the polypeptides to particular regions of the primary
amino acid sequences of the δ-endotoxins to obtain mutants with increased
insecticidal activity or enhanced insecticidal specificity.

A first method for producing a modified B. thuringiensis Cry3Bb
δ-endotoxin having improved insecticidal activity or specificity disclosed
herein generally involves obtaining a high-resolution 3D crystal structure of
the endotoxin; locating in the crystal structure one or more regions of bound
water wherein the bound water forms a contiguous hydrated surface separated by
no more than about 16 Å; increasing the number of water molecules in this
surface by increasing the hydrophobicity of one or more amino acids of the
protein in the region; and obtaining the modified δ-endotoxin so produced.
Exemplary δ-endotoxins include Cry3Bb.11032, Cry3Bb.11227, Cry3Bb.11241,
Cry3Bb.11051, Cry3Bb.11242, and Cry3Bb.11098.

A second method for producing a modified B. thuringiensis Cry3Bb
δ-endotoxin having improved insecticidal activity comprises identifying a loop
region in a δ-endotoxin; modifying one or more amino acids in the loop to
increase the hydrophobicity of the amino acids; and obtaining the modified
δ-endotoxin so produced. Preferred δ-endotoxins produced by this method include
Cry3Bb.11241, Cry3Bb.11242, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, and
Cry3Bb.11239.
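
To make the notion of increasing loop hydrophobicity concrete, the following
Python sketch scores a loop before and after a substitution. It assumes the
Kyte-Doolittle hydropathy index as the measure of hydrophobicity; the text above
does not name a scale, so that choice, the function names, and any loop
sequences supplied to the functions are illustrative assumptions only.

```python
# Kyte-Doolittle hydropathy index (an assumed scale; the text names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(loop: str) -> float:
    """Average hydropathy of a loop given as one-letter amino acid codes."""
    if not loop:
        raise ValueError("loop sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in loop.upper()) / len(loop)


def substitute(loop: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position within the loop."""
    if not 1 <= position <= len(loop):
        raise ValueError("position lies outside the loop")
    return loop[:position - 1] + new_residue + loop[position:]


def increases_hydrophobicity(original: str, modified: str) -> bool:
    """True if the modified loop has a higher mean hydropathy score."""
    return mean_hydropathy(modified) > mean_hydropathy(original)
```

Under this scale, replacing an aspartate with a leucine in a hypothetical
four-residue loop raises the mean score, so the substitution would count as an
increase in hydrophobicity.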

A method for increasing the mobility of channel forming helices of a
B. thuringiensis Cry3B δ-endotoxin is also provided by the present invention.
The method generally comprises disrupting one or more hydrogen bonds formed
between a first amino acid of one or more of the channel forming helices and a
second amino acid of the δ-endotoxin. The hydrogen bonds may be formed inter- or
intramolecularly, and the disrupting may consist of replacing a first or second
amino acid with a third amino acid whose spatial distance is greater than about
3 Å, or whose spatial orientation bond angle is not equal to 180±60 degrees
relative to the hydrogen bonding site of the first or second amino acid.
δ-Endotoxins produced by this method and disclosed herein include Cry3Bb.11222,
Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225, Cry3Bb.11226, Cry3Bb.11227,
Cry3Bb.11231, Cry3Bb.11241, Cry3Bb.11242, and Cry3Bb.11098.
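
The geometric criteria stated above (a distance greater than about 3 Å, or a
bond angle outside 180±60 degrees) can be expressed as a simple test on atomic
coordinates. The Python sketch below is illustrative only: it assumes that the
distance is measured between the donor and acceptor atoms and that the angle is
the donor-hydrogen-acceptor angle, neither of which is specified in this
passage, and the function name and constants are hypothetical.

```python
import math

MAX_HBOND_DISTANCE = 3.0  # angstroms; the "about 3 Å" limit in the text
MIN_HBOND_ANGLE = 120.0   # degrees; lower edge of the 180 ± 60 window


def _subtract(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _length(v):
    return math.sqrt(sum(x * x for x in v))


def hydrogen_bond_disrupted(donor, hydrogen, acceptor) -> bool:
    """True if the geometry falls outside the stated hydrogen-bond limits.

    Each argument is an (x, y, z) coordinate in angstroms.
    """
    distance = _length(_subtract(acceptor, donor))
    to_donor = _subtract(donor, hydrogen)
    to_acceptor = _subtract(acceptor, hydrogen)
    cosine = sum(a * b for a, b in zip(to_donor, to_acceptor)) / (
        _length(to_donor) * _length(to_acceptor)
    )
    # Clamp to guard against rounding slightly outside [-1, 1].
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cosine))))
    # An angle between three points never exceeds 180 degrees, so only
    # the lower edge of the 180 ± 60 window needs to be tested.
    return distance > MAX_HBOND_DISTANCE or angle < MIN_HBOND_ANGLE
```

A replacement residue whose side chain places the acceptor too far away, or at
too sharp an angle, would be reported as disrupting the bond under these
assumptions.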

Also disclosed is a method of increasing the flexibility of a loop region in
a channel forming domain of a B. thuringiensis Cry3Bb δ-endotoxin. This method
comprises obtaining a crystal structure of a Cry3Bb δ-endotoxin having one or
more loop regions; identifying the amino acids comprising the loop region; and
altering one or more of the amino acids to reduce steric hindrance in the loop
region, wherein the altering increases flexibility of the loop region in the
δ-endotoxin. Examples of δ-endotoxins produced using this method include
Cry3Bb.11032, Cry3Bb.11051, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237,
Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11227, Cry3Bb.11234, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11036, and Cry3Bb.11098.

Another aspect of the invention is a method for increasing the activity of a
δ-endotoxin, comprising reducing or eliminating binding of the δ-endotoxin to a
carbohydrate in a target insect gut. The eliminating or reducing may be
accomplished by removal of one or more α helices of domain 1 of the δ-endotoxin,
for example, by removal of α helices α1, α2a/b, and α3. An exemplary δ-endotoxin
produced using the method is Cry3Bb.60.

Alternatively, the reducing or eliminating may be accomplished by replacing
one or more amino acids within loop β1,α8 with one or more amino acids having
increased hydrophobicity. Such a method gives rise to δ-endotoxins such as
Cry3Bb.11228, Cry3Bb.11230, Cry3Bb.11231, Cry3Bb.11237, and Cry3Bb.11098, which
are described in detail herein.

Alternatively, the reducing or eliminating is accomplished by replacing one
or more specific amino acids with any other amino acid. Such replacements are
described in Table 2, and in the examples herein. One example is the δ-endotoxin
designated herein as Cry3Bb.11221.

Also provided is a method of identifying a region of a Cry3Bb δ-endotoxin
for targeted mutagenesis, comprising: obtaining a crystal structure of the
δ-endotoxin; identifying from the crystal structure one or more surface-exposed
amino acids in the protein; randomly substituting one or more of the
surface-exposed amino acids to obtain a plurality of mutated polypeptides,
wherein at least 50% of the mutated polypeptides have diminished insecticidal
activity; and identifying from the plurality of mutated polypeptides one or more
regions of the Cry3Bb δ-endotoxin for targeted mutagenesis. The method may
further comprise determining the amino acid sequences of a plurality of mutated
polypeptides having diminished activity, and identifying one or more amino acid
residues required for insecticidal activity.

In another embodiment, the invention provides a process for producing a
Cry3Bb δ-endotoxin having improved insecticidal activity. The process generally
involves the steps of obtaining a high-resolution crystal structure of the
protein; determining the electrostatic surface distribution of the protein;
identifying one or more regions of high electrostatic diversity; modifying the
electrostatic diversity of the region by altering one or more amino acids in the
region; and obtaining a Cry3Bb δ-endotoxin which has improved insecticidal
activity. In one embodiment, the electrostatic diversity may be decreased
relative to the electrostatic diversity of a native Cry3Bb δ-endotoxin.
Exemplary δ-endotoxins with decreased electrostatic diversity include
Cry3Bb.11227, Cry3Bb.11241, and Cry3Bb.11242. Alternatively, the electrostatic
diversity may be increased relative to the electrostatic diversity of a native
Cry3Bb δ-endotoxin. An exemplary δ-endotoxin with increased electrostatic
diversity is Cry3Bb.11234.

Furthermore, the invention also provides a method of producing a Cry3Bb
δ-endotoxin having improved insecticidal activity which involves obtaining a
high-resolution crystal structure; identifying the presence of one or more metal
binding sites in the protein; altering one or more amino acids in the binding
site; and obtaining an altered protein, wherein the protein has improved
insecticidal activity. The altering may involve the elimination of one or more
metal binding sites. Exemplary δ-endotoxins include Cry3Bb.11222, Cry3Bb.11224,
Cry3Bb.11225, and Cry3Bb.11226.

A further aspect of the invention involves a method of identifying a
B. thuringiensis Cry3Bb δ-endotoxin having improved channel activity. This
method in an overall sense involves obtaining a Cry3Bb δ-endotoxin suspected of
having improved channel activity; and determining one or more of the following
characteristics in the δ-endotoxin, and comparing such characteristics to those
obtained for the wild-type unmodified δ-endotoxin: (1) the rate of channel
formation, (2) the rate of growth of channel conductance, or (3) the duration of
open channel state. From this comparison, one may then select a δ-endotoxin
which has an increased rate of channel formation compared to the wild-type
δ-endotoxin. Examples of Cry3Bb δ-endotoxins prepared by this method include
Cry3Bb.60, Cry3Bb.11035, Cry3Bb.11048, Cry3Bb.11032, Cry3Bb.11223,
Cry3Bb.11224, Cry3Bb.11226, Cry3Bb.11221, Cry3Bb.11242, Cry3Bb.11230, and
Cry3Bb.11098.
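
The comparison against the wild-type δ-endotoxin described above amounts to a
simple selection over measured values. The Python sketch below illustrates one
way such a comparison might be organized. The record type, field names, and
function names are hypothetical, units are left to the user, and no measurements
from this disclosure are represented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChannelRecord:
    """Channel measurements for one toxin (all values in user-chosen units)."""
    name: str
    formation_rate: float           # (1) rate of channel formation
    conductance_growth_rate: float  # (2) rate of growth of conductance
    open_duration: float            # (3) duration of the open channel state


def improved_characteristics(candidate: ChannelRecord,
                             wild_type: ChannelRecord) -> List[str]:
    """List the characteristics in which the candidate exceeds wild type."""
    fields = ("formation_rate", "conductance_growth_rate", "open_duration")
    return [f for f in fields if getattr(candidate, f) > getattr(wild_type, f)]


def select_faster_formers(candidates: List[ChannelRecord],
                          wild_type: ChannelRecord) -> List[str]:
    """Names of candidates whose channel formation rate exceeds wild type."""
    return [c.name for c in candidates
            if c.formation_rate > wild_type.formation_rate]
```

The selection step uses only the rate of channel formation, mirroring the final
sentence of the paragraph above, while the first function reports all three
characteristics for inspection.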

Also provided is a method for producing a modified Cry3Bb δ-endotoxin
having improved insecticidal activity which involves altering one or more
non-surface amino acids located at or near the point of greatest convergence of
two or more loop regions of the Cry3Bb δ-endotoxin, such that the altering
decreases the mobility of one or more of the loop regions. The mobility may
conveniently be determined by comparing the thermal denaturation of the modified
protein to a wild-type Cry3Bb δ-endotoxin. An exemplary crystal protein produced
by this method is Cry3Bb.11095.

A further aspect of the invention involves a method for preparing a
modified Cry3Bb δ-endotoxin having improved insecticidal activity, comprising
modifying one or more amino acids in the loop to increase the hydrophobicity of
said amino acids; and altering one or more of said amino acids to reduce steric
hindrance in the loop region, wherein the altering increases flexibility of the
loop region in the endotoxin. An exemplary Cry3Bb δ-endotoxin so produced is
selected from the group consisting of Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081,
Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11231, Cry3Bb.11235, and
Cry3Bb.11098.

The invention also provides a method of improving the insecticidal activity
of a B. thuringiensis Cry3Bb δ-endotoxin, which generally comprises inserting
one or more protease sensitive sites into one or more loop regions of domain 1
of the δ-endotoxin. Preferably, the loop region is α3,4, and an exemplary
δ-endotoxin so produced is Cry3Bb.11221.



2.2 Polypeptide Compositions 

The crystal proteins so produced by each of the methods described herein
also represent important aspects of the invention. Such crystal proteins
preferably include a protein or peptide selected from the group consisting of
Cry3Bb-60, Cry3Bb.11221, Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224,
Cry3Bb.11225, Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229,
Cry3Bb.11230, Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234,
Cry3Bb.11235, Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239,
Cry3Bb.11241, Cry3Bb.11242, Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036,
Cry3Bb.11046, Cry3Bb.11048, Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058,
Cry3Bb.11081, Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095, and
Cry3Bb.11098.

In preferred embodiments, the protein comprises a contiguous amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, and SEQ ID NO:108.

Highly preferred are those crystal proteins which are encoded by the
nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID
NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51,
SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID
NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101,
or SEQ ID NO:107, or a nucleic acid sequence which hybridizes to the nucleic
acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ
ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31,
SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID
NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53,
SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID
NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID
NO:107 under conditions of moderate stringency.

Amino acid, peptide and protein sequences within the scope of the present
invention include, but are not limited to, the sequences set forth in SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID
NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46,
SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID
NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68,
SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, and SEQ ID NO:108, and alterations
in the amino acid sequences including alterations, deletions, mutations, and
homologs.

Compositions which comprise from about 0.5% to about 99% by weight of
the crystal protein, or more preferably from about 5% to about 75%, or from
about 25% to about 50% by weight of the crystal protein are provided herein.
Such compositions may readily be prepared using techniques of protein production
and purification well-known to those of skill, and the methods disclosed herein.
Such a process for preparing a Cry3Bb* crystal protein generally involves the
steps of culturing a host cell which expresses the Cry3Bb* protein (such as a
B. thuringiensis EG11221, EG11222, EG11223, EG11224, EG11225, EG11226,
EG11227, EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234,
EG11235, EG11236, EG11237, EG11238, EG11239, EG11241, EG11242, EG11032,
EG11035, EG11036, EG11046, EG11048, EG11051, EG11057, EG11058, EG11081,
EG11082, EG11083, EG11084, EG11095, or EG11098 cell) under conditions effective
to produce the crystal protein, and then obtaining the crystal protein so
produced.

The protein may be present within intact cells, and as such, no subsequent
protein isolation or purification steps may be required. Alternatively, the
cells may be broken, sonicated, lysed, disrupted, or plasmolyzed to free the
crystal protein(s) from the remaining cell debris. In such cases, one may desire
to isolate, concentrate, or further purify the resulting crystals containing the
proteins prior to use, such as, for example, in the formulation of insecticidal
compositions. The composition may ultimately be purified to consist almost
entirely of the pure protein, or alternatively, be purified or isolated to a
degree such that the composition comprises the crystal protein(s) in an amount
of from about 0.5% to about 99% by weight, or from about 5% to about 95% by
weight, or from about 15% to about 85% by weight, or from about 25% to about 75%
by weight, or from about 40% to about 60% by weight, etc.


2.3 Recombinant Vectors Expressing cry3* Genes

One important embodiment of the invention is a recombinant vector which
comprises a nucleic acid segment encoding one or more of the novel
B. thuringiensis crystal proteins disclosed herein. Such a vector may be
transferred to and replicated in a prokaryotic or eukaryotic host, with
bacterial cells being particularly preferred as prokaryotic hosts, and plant
cells being particularly preferred as eukaryotic hosts.

In preferred embodiments, the recombinant vector comprises a nucleic acid
segment encoding the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, or SEQ ID NO:108. Highly preferred nucleic acid segments are
those which have the sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ
ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID
NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51,
SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID
NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101,
or SEQ ID NO:107.

Another important embodiment of the invention is a transformed host cell
which expresses one or more of these recombinant vectors. The host cell may be
either prokaryotic or eukaryotic, and particularly preferred host cells are
those which express the nucleic acid segment(s) comprising the recombinant
vector which encode one or more B. thuringiensis crystal proteins comprising
modified amino acid sequences in one or more loop regions of domain 1, or
between α helix 7 of domain 1 and β strand 1 of domain 2. Bacterial cells are
particularly preferred as prokaryotic hosts, and plant cells are particularly
preferred as eukaryotic hosts.

In an important embodiment, the invention discloses and claims a host cell
wherein the modified amino acid sequences comprise one or more loop regions
between α helices 1 and 2, α helices 2 and 3, α helices 3 and 4, α helices 4 and
5, α helices 5 and 6, or α helices 6 and 7 of domain 1, or between α helix 7 of
domain 1 and β strand 1 of domain 2. A particularly preferred host cell is one
that comprises the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28,
SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID
NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100,
SEQ ID NO:102, or SEQ ID NO:108, and more preferably, one that comprises the
nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID
NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51,
SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID
NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101,
or SEQ ID NO:107.




Bacterial host cells transformed with a nucleic acid segment encoding a
modified Cry3Bb crystal protein according to the present invention are disclosed
and claimed herein, and in particular, a B. thuringiensis cell having designation
EG11221, EG11222, EG11223, EG11224, EG11225, EG11226, EG11227,
EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234,
EG11235, EG11236, EG11237, EG11238, EG11239, EG11241, EG11242,
EG11032, EG11035, EG11036, EG11046, EG11048, EG11051, EG11057,
EG11058, EG11081, EG11082, EG11083, EG11084, EG11095, or EG11098.

In another embodiment, the invention encompasses a method of using a
nucleic acid segment of the present invention that encodes a cry3Bb* gene. The
method generally comprises the steps of: (a) preparing a recombinant vector in
which the cry3Bb* gene is positioned under the control of a promoter; (b)
introducing the recombinant vector into a host cell; (c) culturing the host cell
under conditions effective to allow expression of the Cry3Bb* crystal protein
encoded by said cry3Bb* gene; and (d) obtaining the expressed Cry3Bb* crystal
protein or peptide.

A wide variety of ways are available for introducing a B. thuringiensis gene
expressing a toxin into the microorganism host under conditions which allow for
stable maintenance and expression of the gene. One can provide for DNA
constructs which include the transcriptional and translational regulatory signals
for expression of the toxin gene, the toxin gene under their regulatory control,
and a DNA sequence homologous with a sequence in the host organism, whereby
integration will occur, and/or a replication system which is functional in the
host, whereby integration or stable maintenance will occur.

The transcriptional initiation signals will include a promoter and a
transcriptional initiation start site. In some instances, it may be desirable to
provide for regulated expression of the toxin, where expression of the toxin will
only occur after release into the environment. This can be achieved with
operators or a region binding to an activator or enhancers, which are capable of
induction upon a change in the physical or chemical environment of the
microorganisms. For example, a temperature sensitive regulatory region may be
employed, where the organisms may be grown up in the laboratory without
expression of a toxin, but upon release into the environment, expression would
begin. Other techniques may employ a specific nutrient medium in the laboratory,
which inhibits the expression of the toxin, where the nutrient medium in the
environment would allow for expression of the toxin. For translational
initiation, a ribosomal binding site and an initiation codon will be present.

Various manipulations may be employed for enhancing the expression of
the messenger RNA, particularly by using an active promoter, as well as by
employing sequences which enhance the stability of the messenger RNA. The
transcriptional and translational termination region will involve stop codon(s),
a terminator region, and optionally, a polyadenylation signal. A hydrophobic
"leader" sequence may be employed at the amino terminus of the translated
polypeptide sequence in order to promote secretion of the protein across the
inner membrane.

In the direction of transcription, namely in the 5' to 3' direction of the
coding or sense sequence, the construct will involve the transcriptional
regulatory region, if any, and the promoter, where the regulatory region may be
either 5' or 3' of the promoter, the ribosomal binding site, the initiation
codon, the structural gene having an open reading frame in phase with the
initiation codon, the stop codon(s), the polyadenylation signal sequence, if any,
and the terminator region. This sequence as a double strand may be used by itself
for transformation of a microorganism host, but will usually be included with a
DNA sequence involving a marker, where the second DNA sequence may be joined to
the toxin expression construct during introduction of the DNA into the host.
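
The 5' to 3' ordering of construct elements described above can be sketched as
a small order check. This is an illustration only: the element names and the
helper is_valid_order are hypothetical labels chosen for this sketch, and for
simplicity the optional regulatory region is placed only 5' of the promoter,
although the text also permits it 3' of the promoter.

```python
# Illustrative labels for the construct elements named in the text (hypothetical names).
CANONICAL_ORDER = [
    "regulatory_region",       # optional; placed 5' of the promoter in this sketch
    "promoter",
    "ribosomal_binding_site",
    "initiation_codon",
    "structural_gene",
    "stop_codon",
    "polyadenylation_signal",  # optional ("if any" in the text)
    "terminator",
]
OPTIONAL = {"regulatory_region", "polyadenylation_signal"}
REQUIRED = [e for e in CANONICAL_ORDER if e not in OPTIONAL]


def is_valid_order(elements):
    """Return True if `elements` (listed 5' to 3') has every required element
    exactly once, no unknown or repeated elements, and follows the canonical order."""
    if len(set(elements)) != len(elements):
        return False
    if any(e not in CANONICAL_ORDER for e in elements):
        return False
    if any(r not in elements for r in REQUIRED):
        return False
    positions = [CANONICAL_ORDER.index(e) for e in elements]
    return positions == sorted(positions)
```

A construct listing only the required elements passes the check, as does the
full list with both optional elements; swapping the promoter and the ribosomal
binding site, or omitting the terminator, fails it.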

By a marker is intended a structural gene which provides for selection of
those hosts which have been modified or transformed. The marker will normally
provide for selective advantage, for example, providing for biocide resistance,
e.g., resistance to antibiotics or heavy metals; complementation, so as to
provide prototrophy to an auxotrophic host, or the like. Preferably,
complementation is employed, so that the modified host may not only be selected,
but may also be competitive in the field. One or more markers may be employed in
the development of the constructs, as well as for modifying the host. The
organisms may be further modified by providing for a competitive advantage
against other wild-type microorganisms in the field. For example, genes
expressing metal chelating agents, e.g., siderophores, may be introduced into the
host along with the structural gene expressing the toxin. In this manner, the
enhanced expression of a siderophore may provide for a competitive advantage for
the toxin-producing host, so that it may effectively compete with the wild-type
microorganisms and stably occupy a niche in the environment.

Where no functional replication system is present, the construct will also
include a sequence of at least 50 basepairs (bp), preferably at least about 100
bp, more preferably at least about 1000 bp, and usually not more than about 2000
bp of a sequence homologous with a sequence in the host. In this way, the
probability of legitimate recombination is enhanced, so that the gene will be
integrated into the host and stably maintained by the host. Desirably, the toxin
gene will be in close proximity to the gene providing for complementation as
well as the gene providing for the competitive advantage. Therefore, in the event
that a toxin gene is lost, the resulting organism will be likely to also lose the
complementing gene and/or the gene providing for the competitive advantage, so
that it will be unable to compete in the environment with the organisms
retaining the intact construct.
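
The length preferences for the homologous sequence can be summarized in a short
sketch. The function name classify_homology_length and the tier labels are
hypothetical, and the "about" qualifiers in the text are treated here as exact
thresholds, which is a simplifying assumption.

```python
def classify_homology_length(bp):
    """Map a homologous-sequence length (in base pairs) to the preference tiers
    recited in the text. Thresholds of 50, 100, 1000 and 2000 bp come from the
    text; treating them as hard cut-offs is an assumption of this sketch."""
    if bp < 50:
        return "below minimum"
    if bp > 2000:
        return "longer than usual"
    if bp >= 1000:
        return "more preferred"
    if bp >= 100:
        return "preferred"
    return "acceptable"
```

For instance, a 60 bp homologous region is classed as acceptable, a 250 bp
region as preferred, and a 1500 bp region as more preferred.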

A large number of transcriptional regulatory regions are available from a
wide variety of microorganism hosts, such as bacteria, bacteriophage,
cyanobacteria, algae, fungi, and the like. Various transcriptional regulatory
regions include the regions associated with the trp gene, lac gene, gal gene,
the λ L and λ R promoters, the tac promoter, the naturally-occurring promoters
associated with the δ-endotoxin gene, where functional in the host. See for
example, U.S. Patents 4,332,898; 4,342,832; and 4,356,270 (each of which is
specifically incorporated herein by reference). The termination region may be
the termination region normally associated with the transcriptional initiation
region or a different transcriptional initiation region, so long as the two
regions are compatible and functional in the host.

Where stable episomal maintenance or integration is desired, a plasmid will
be employed which has a replication system which is functional in the host. The
replication system may be derived from the chromosome, an episomal element
normally present in the host or a different host, or a replication system from a
virus which is stable in the host. A large number of plasmids are available,
such as pBR322, pACYC184, RSF1010, pRO1614, and the like. See for example, Olson
et al. (1982); Bagdasarian et al. (1981); Baum et al. (1990); and U.S. Patents
4,356,270; 4,362,817; 4,371,625; and 5,441,884, each incorporated specifically
herein by reference.

The B. thuringiensis gene can be introduced between the transcriptional and
translational initiation region and the transcriptional and translational
termination region, so as to be under the regulatory control of the initiation
region. This construct will be included in a plasmid, which will include at
least one replication system, but may include more than one, where one
replication system is employed for cloning during the development of the plasmid
and the second replication system is necessary for functioning in the ultimate
host. In addition, one or more markers may be present, which have been described
previously. Where integration is desired, the plasmid will desirably include a
sequence homologous with the host genome.

The transformants can be isolated in accordance with conventional ways,
usually employing a selection technique, which allows for selection of the
desired organism as against unmodified organisms or transferring organisms, when
present. The transformants then can be tested for pesticidal activity. If
desired, unwanted or ancillary DNA sequences may be selectively removed from the
recombinant bacterium by employing site-specific recombination systems, such as
those described in U.S. Patent 5,441,884 (specifically incorporated herein by
reference).

2.4 CRY3 DNA Segments

A B. thuringiensis cry3* gene encoding a crystal protein having one or
more mutations in one or more regions of the peptide represents an important
aspect of the invention. Preferably, the cry3* gene encodes an amino acid
sequence in which one or more amino acid residues have been changed based on the
methods disclosed herein, and particularly those changes which have been made
for the purpose of altering the insecticidal activity or specificity of the
crystal protein.

In accordance with the present invention, nucleic acid sequences include
and are not limited to DNA, including and not limited to cDNA and genomic
DNA, genes; RNA, including and not limited to mRNA and tRNA; antisense
sequences, nucleosides, and suitable nucleic acid sequences such as those set
forth in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107, and alterations in the nucleic
acid sequences including alterations, deletions, mutations, and homologs capable
of expressing the B. thuringiensis modified toxins of the present invention.
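
Because the same sequence identifiers recur throughout this section, it may help
to note their pattern: the nucleic acid sequences recited are the odd-numbered
SEQ ID NOs 1 through 69 plus 99, 101 and 107, and the amino acid sequences
recited elsewhere in this section are the even-numbered SEQ ID NOs 2 through 70
plus 100, 102 and 108. The sketch below regenerates both lists. The function
names are hypothetical, and the pairing of each nucleic acid sequence n with
amino acid sequence n + 1 is an assumption inferred from the numbering, not a
statement made in this passage.

```python
def seq_id_lists():
    """Return (nucleic_ids, amino_ids) as recited in the text.
    Pairing n -> n + 1 is an assumption based on the numbering pattern."""
    nucleic = list(range(1, 70, 2)) + [99, 101, 107]
    amino = [n + 1 for n in nucleic]
    return nucleic, amino


def format_seq_ids(ids, conjunction="or"):
    """Render identifiers in the style used in the text, e.g.
    'SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5'. Expects at least two ids."""
    labels = [f"SEQ ID NO:{n}" for n in ids]
    return ", ".join(labels[:-1]) + f", {conjunction} " + labels[-1]
```

Calling format_seq_ids on either list reproduces the recitation used in the
text, which can be convenient when checking a scanned copy for transcription
errors.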

As such, the present invention also concerns DNA segments that are free
from total genomic DNA and that encode the novel synthetically-modified crystal
proteins disclosed herein. DNA segments encoding these peptide species may
prove to encode proteins, polypeptides, subunits, functional domains, and the
like of crystal protein-related or other non-related gene products. In addition,
these DNA segments may be synthesized entirely in vitro using methods that are
well-known to those of skill in the art.

As used herein, the term "DNA segment" refers to a DNA molecule that has
been isolated free of total genomic DNA of a particular species. Therefore, a DNA
segment encoding a crystal protein or peptide refers to a DNA segment that
contains crystal protein coding sequences yet is isolated away from, or purified
free from, total genomic DNA of the species from which the DNA segment is
obtained, which in the instant case is the genome of the Gram-positive bacterial
genus, Bacillus, and in particular, the species of Bacillus known as B.
thuringiensis. Included within the term "DNA segment" are DNA segments and
smaller fragments of such segments, and also recombinant vectors, including, for
example, plasmids, cosmids, phagemids, phage, viruses, and the like.

Similarly, a DNA segment comprising an isolated or purified crystal
protein-encoding gene refers to a DNA segment which may include, in addition to
peptide encoding sequences, certain other elements such as regulatory sequences,
isolated substantially away from other naturally occurring genes or
protein-encoding sequences. In this respect, the term "gene" is used for
simplicity to refer to a functional protein-, polypeptide- or peptide-encoding
unit. As will be understood by those in the art, this functional term includes
genomic sequences, operon sequences and smaller engineered gene segments that
express, or may be adapted to express, proteins, polypeptides or peptides.

"Isolated substantially away from other coding sequences" means that the
gene of interest, in this case, a gene encoding a bacterial crystal protein,
forms the significant part of the coding region of the DNA segment, and that the
DNA segment does not contain large portions of naturally-occurring coding DNA,
such as large chromosomal fragments or other functional genes or operon coding
regions. Of course, this refers to the DNA segment as originally isolated, and
does not exclude genes, recombinant genes, synthetic linkers, or coding regions
later added to the segment by the hand of man.

Particularly preferred DNA sequences are those encoding Cry3Bb.60,
Cry3Bb.11221, Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225,
Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235,
Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046,
Cry3Bb.11048, Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081,
Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095 and Cry3Bb.11098
crystal proteins, and in particular cry3Bb* genes such as cry3Bb.60,
cry3Bb.11221, cry3Bb.11222, cry3Bb.11223, cry3Bb.11224, cry3Bb.11225,
cry3Bb.11226, cry3Bb.11227, cry3Bb.11228, cry3Bb.11229, cry3Bb.11230,
cry3Bb.11231, cry3Bb.11232, cry3Bb.11233, cry3Bb.11234, cry3Bb.11235,
cry3Bb.11236, cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241,
cry3Bb.11242, cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046,
cry3Bb.11048, cry3Bb.11051, cry3Bb.11057, cry3Bb.11058, cry3Bb.11081,
cry3Bb.11082, cry3Bb.11083, cry3Bb.11084, cry3Bb.11095 and cry3Bb.11098. In
particular embodiments, the invention concerns isolated DNA segments and
recombinant vectors incorporating DNA sequences that encode a Cry peptide
species that includes within its amino acid sequence an amino acid sequence
essentially as set forth in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108.

The term "a sequence essentially as set forth in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108" means that the sequence
substantially corresponds to a portion of the sequence of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, and has relatively few amino
acids that are not identical to, or a biologically functional equivalent of, the
amino acids of any of these sequences. The term "biologically functional
equivalent" is well understood in the art and is further defined in detail
herein (e.g., see Illustrative Embodiments).

Accordingly, sequences that have between about 70% and about 75% or
between about 75% and about 80%, or more preferably between about 81% and
about 90%, or even more preferably between about 91% or 92% or 93% and about
97% or 98% or 99% amino acid sequence identity or functional equivalence to the
amino acids of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108 will be sequences that are
"essentially as set forth in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108."
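
The identity ranges above can be made concrete with a minimal sketch. It assumes
the two amino acid sequences have already been aligned to equal length (with "-"
marking gaps) and counts identity over the full alignment length, which is only
one of several conventions in use. The function names and tier labels are
hypothetical, the "about" qualifiers are treated as exact thresholds, and
functional equivalence (which the text treats alongside identity) is not
modeled.

```python
def percent_identity(seq_a, seq_b):
    """Percent of alignment columns holding the same residue in both sequences.
    Inputs must be pre-aligned and of equal length; '-' marks a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def identity_tier(pct):
    """Map percent identity to the ranges named in the text (hard cut-offs assumed)."""
    if pct >= 91:
        return "even more preferred"
    if pct >= 81:
        return "more preferred"
    if pct >= 70:
        return "within range"
    return "outside range"
```

As a toy example (the peptides are invented for illustration and are not taken
from any SEQ ID NO), aligning ACDEFGHIKL with ACDEFGHIKV gives nine matches in
ten columns, so the pair falls in the "more preferred" range of about 81% to
about 90%.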

It will also be understood that amino acid and nucleic acid sequences may
include additional residues, such as additional N- or C-terminal amino acids or
5' or 3' sequences, and yet still be essentially as set forth in one of the
sequences disclosed herein, so long as the sequence meets the criteria set forth
above, including the maintenance of biological protein activity where protein
expression is concerned. The addition of terminal sequences particularly applies
to nucleic acid sequences that may, for example, include various non-coding
sequences flanking either of the 5' or 3' portions of the coding region or may
include various internal sequences, i.e., introns, which are known to occur
within genes.

The nucleic acid segments of the present invention, regardless of the length
of the coding sequence itself, may be combined with other DNA sequences, such
as promoters, polyadenylation signals, additional restriction enzyme sites,
multiple cloning sites, other coding segments, and the like, such that their
overall length may vary considerably. It is therefore contemplated that a
nucleic acid fragment of almost any length may be employed, with the total
length preferably being limited by the ease of preparation and use in the
intended recombinant DNA protocol.

For example, nucleic acid fragments may be prepared that include a short
contiguous stretch encoding the peptide sequence disclosed in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, or that are identical to or
complementary to DNA sequences which encode the peptide disclosed in
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, and particularly the DNA
segments disclosed in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107.

Highly preferred nucleic acid segments of the present invention comprise
one or more cry genes of the invention, or a portion of one or more cry genes of
the invention. For certain applications, relatively small contiguous nucleic
acid sequences are preferable, such as those which are about 14 or 15 or 16 or
17 or 18 or 19, or 20, or 30-50, 51-80, 81-100 or so nucleotides in length.
Alternatively, in some embodiments, and particularly those involving preparation
of recombinant vectors, transformation of suitable host cells, and preparation
of transgenic plant cells, longer nucleic acid segments are preferred,
particularly those that include the entire coding region of one or more cry
genes. As such, the preferred segments may include those that are up to about
20,000 or so nucleotides in length, or alternatively, shorter sequences such as
those about 19,000, about 18,000, about 17,000, about 16,000, about 15,000,
about 14,000, about 13,000, about 12,000, about 11,000, about 10,000, about
9,000, about 8,000, about 7,000, about 6,000, about 5,000, about 4,500, about
4,000, about 3,500, about 3,000, about 2,500, about 2,000, about 1,500, about
1,000, about 500, or about 200 or so base pairs in length. Of course, these
numbers are not intended to be exclusionary of all possible intermediate lengths
in the range of from about 20,000 to about 15 nucleotides, as all of these
intermediate lengths are also contemplated to be useful, and fall within the
scope of the present invention. It will be readily understood that "intermediate
lengths", in these contexts, means any length between the quoted ranges, such as
14, 15, 16, 17, 18, 19, 20, etc.; 21, 22, 23, 24, 25, 26, 27, 28, 29, etc.; 30,
31, 32, 33, 34, 35, 36, etc.; 40, 41, 42, 43, 44, etc.; 50, 51, 52, 53, etc.;
60, 61, 62, 63, etc.; 70, 80, 90, 100, 110, 120, 130, etc.; 200, 210, 220, 230,
240, 250, etc.; including all integers in the entire range from about 14 to
about 10,000, including those integers in the ranges 200-500; 500-1,000;
1,000-2,000; 2,000-3,000; 3,000-5,000 and the like.

In a preferred embodiment, the nucleic acid segments comprise a sequence
of from about 1800 to about 18,000 base pairs in length, and comprise one or more
genes which encode a modified Cry3* polypeptide disclosed herein which has
increased activity against Coleopteran insect pests.

It will also be understood that this invention is not limited to the particular
nucleic acid sequences which encode peptides of the present invention, or which
encode the amino acid sequence of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108, including the DNA sequences
which are particularly disclosed in
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49,
SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59,
SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107. Recombinant vectors and
isolated DNA segments may therefore variously include the peptide-coding regions
themselves, coding regions bearing selected alterations or modifications in the
basic coding region, or they may encode larger polypeptides that nevertheless
include these peptide-coding regions or may encode biologically functional
equivalent proteins or peptides that have variant amino acid sequences.

The DNA segments of the present invention encompass biologically functional
equivalent peptides. Such sequences may arise as a consequence of codon
redundancy and functional equivalency that are known to occur naturally within
nucleic acid sequences and the proteins thus encoded. Alternatively,
functionally-equivalent proteins or peptides may be created via the application
of recombinant DNA technology, in which changes in the protein structure may be
engineered, based on considerations of the properties of the amino acids being
exchanged. Changes designed by man may be introduced through the application of
site-directed mutagenesis techniques, e.g., to introduce improvements to the
antigenicity of the protein or to test mutants in order to examine activity at
the molecular level.
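
Codon redundancy, mentioned above, can be illustrated with a minimal sketch in
which two different DNA sequences translate to the same peptide. The two
sequences are toy examples invented for illustration and are not taken from any
SEQ ID NO; the codon table is a small excerpt of the standard genetic code, and
the function name translate is a hypothetical helper.

```python
# Excerpt of the standard genetic code: just the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L",
    "TAA": "*", "TGA": "*",  # stop codons
}


def translate(dna):
    """Translate a coding sequence codon by codon using the excerpted table.
    Raises KeyError for codons outside the excerpt."""
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two toy sequences that differ at four positions but use synonymous codons.
seq_1 = "ATGGCTAAACTGTAA"
seq_2 = "ATGGCGAAGTTATGA"
same_peptide = translate(seq_1) == translate(seq_2)
```

Here both sequences encode Met-Ala-Lys-Leu followed by a stop codon, so
same_peptide is True even though the DNA sequences differ.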

If desired, one may also prepare fusion proteins and peptides, e.g., where
the peptide-coding regions are aligned within the same expression unit with
other proteins or peptides having desired functions, such as for purification or
immunodetection purposes (e.g., proteins that may be purified by affinity
chromatography and enzyme label coding regions, respectively).

Recombinant vectors form further aspects of the present invention.
Particularly useful vectors are contemplated to be those vectors in which the
coding portion of the DNA segment, whether encoding a full length protein or
smaller peptide, is positioned under the control of a promoter. The promoter may
be in the form of the promoter that is naturally associated with a gene encoding
peptides of the present invention, as may be obtained by isolating the 5'
non-coding sequences located upstream of the coding segment or exon, for
example, using recombinant cloning and/or PCR™ technology, in connection with
the compositions disclosed herein.



2.5 Vectors, Host Cells, and Protein Expression 

In other embodiments, it is contemplated that certain advantages will be
gained by positioning the coding DNA segment under the control of a recombinant,
or heterologous, promoter. As used herein, a recombinant or heterologous
promoter is intended to refer to a promoter that is not normally associated with
a DNA segment encoding a crystal protein or peptide in its natural environment.
Such promoters may include promoters normally associated with other genes,
and/or promoters isolated from any bacterial, viral, eukaryotic, or plant cell.
Naturally, it will be important to employ a promoter that effectively directs
the expression of the DNA segment in the cell type, organism, or even animal,
chosen for expression. The use of promoter and cell type combinations for
protein expression is generally known to those of skill in the art of molecular
biology, for example, see Sambrook et al., 1989. The promoters employed may be
constitutive, or inducible, and can be used under the appropriate conditions to
direct high level expression of the introduced DNA segment, such as is
advantageous in the large-scale production of recombinant proteins or peptides.
Appropriate promoter systems contemplated for use in high-level expression
include, but are not limited to, the Pichia expression vector system (Pharmacia
LKB Biotechnology).

In connection with expression embodiments to prepare recombinant proteins and
peptides, it is contemplated that longer DNA segments will most often be used,
with DNA segments encoding the entire peptide sequence being most preferred.
However, it will be appreciated that the use of shorter DNA segments to direct
the expression of crystal peptides or epitopic core regions, such as may be
used to generate anti-crystal protein antibodies, also falls within the scope
of the invention. DNA segments that encode peptide antigens from about 8, 9,
10, or 11 or so amino acids, and up to and including those of about 30, 40, or
50 or so amino acids in length, or more preferably, from about 8 to about 30
amino acids in length, or even more preferably, from about 8 to about 20 amino
acids in length are contemplated to be particularly useful. Such peptide
epitopes may be amino acid sequences which comprise contiguous amino acid
sequence from SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30,
SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40,
SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60,
SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, SEQ ID NO:102, or SEQ ID NO:108.
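
By way of illustration only, the enumeration of contiguous peptide segments of
about 8 to about 20 amino acids from a longer sequence may be sketched as
follows. The function name and the short test sequence are hypothetical and are
not taken from the sequence listing; the sketch only lists candidate segments
and makes no prediction about which of them are immunogenic.

```python
def contiguous_peptides(protein, min_len=8, max_len=20):
    """Yield (start, peptide) for every contiguous window of min_len..max_len residues.

    `start` is the 1-based position of the first residue of the window.
    """
    for length in range(min_len, max_len + 1):
        # range() is empty when the window is longer than the sequence
        for start in range(len(protein) - length + 1):
            yield start + 1, protein[start:start + length]


# Hypothetical 12-residue sequence, used only to exercise the function.
toy_sequence = "MNPNNRSEHDTI"
peptides = list(contiguous_peptides(toy_sequence))
print(len(peptides), peptides[0])
```

A sequence of 12 residues yields 5 + 4 + 3 + 2 + 1 = 15 such segments, since
only window lengths 8 through 12 fit within it.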
2.6 Transformed Host Cells and Transgenic Plants

In one embodiment, the invention provides a transgenic plant having
incorporated into its genome a transgene that encodes a contiguous amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24,
SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34,
SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44,
SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64,
SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, SEQ ID NO:102, and
SEQ ID NO:108.

A further aspect of the invention is a transgenic plant having incorporated
into its genome a cry3Bb* transgene, provided the transgene comprises a nucleic
acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,
SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33,
SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53,
SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63,
SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, and
SEQ ID NO:107. Also disclosed and claimed are progeny of such a transgenic
plant, as well as its seed, progeny from such seeds, and seeds arising from the
second and subsequent generation plants derived from such a transgenic plant.

The invention also discloses and claims host cells, both native and genetically
engineered, which express the novel cry3Bb* genes to produce Cry3Bb*
polypeptides. Preferred examples of bacterial host cells include
B. thuringiensis EG11221, EG11222, EG11223, EG11224, EG11225, EG11226, EG11227,
EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234, EG11235,
EG11236, EG11237, EG11238, EG11239, EG11241, EG11242, EG11032, EG11035,
EG11036, EG11046, EG11048, EG11051, EG11057, EG11058, EG11081, EG11082,
EG11083, EG11084, EG11095, and EG11098.

Methods of using such cells to produce Cry3* crystal proteins are also
disclosed. Such methods generally involve culturing the host cell (such as
B. thuringiensis EG11221, EG11222, EG11223, EG11224, EG11225, EG11226, EG11227,
EG11228, EG11229, EG11230, EG11231, EG11232, EG11233, EG11234, EG11235,
EG11236, EG11237, EG11238, EG11239, EG11241, EG11242, EG11032, EG11035,
EG11036, EG11046, EG11048, EG11051, EG11057, EG11058, EG11081, EG11082,
EG11083, EG11084, EG11095, or EG11098) under conditions effective to produce a
Cry3* crystal protein, and obtaining the Cry3* crystal protein from said cell.

In yet another aspect, the present invention provides methods for producing a
transgenic plant which expresses a nucleic acid segment encoding the novel
recombinant crystal proteins of the present invention. The process of producing
transgenic plants is well-known in the art. In general, the method comprises
transforming a suitable host cell with one or more DNA segments which contain
one or more promoters operatively linked to a coding region that encodes one or
more of the disclosed B. thuringiensis crystal proteins. Such a coding region
is generally operatively linked to a transcription-terminating region, whereby
the promoter is capable of driving the transcription of the coding region in
the cell, and hence providing the cell the ability to produce the recombinant
protein in vivo. Alternatively, in instances where it is desirable to control,
regulate, or decrease the amount of a particular recombinant crystal protein
expressed in a particular transgenic cell, the invention also provides for the
expression of crystal protein antisense mRNA. The use of antisense mRNA as a
means of controlling or decreasing the amount of a given protein of interest in
a cell is well-known in the art.
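
As a minimal sketch of the sequence relationship involved, an antisense RNA is
the reverse complement of the sense mRNA it is intended to pair with. The
function name and the nine-base fragment below are hypothetical and serve only
to show the base-pairing rule; no particular antisense construct is implied.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the reverse complement of an mRNA, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


sense = "AUGGCAUGC"  # hypothetical fragment, not from the sequence listing
antisense = antisense_rna(sense)
print(antisense)
```

Applying the function twice returns the original sense strand, which is a
convenient check that the pairing table is symmetric.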

Another aspect of the invention comprises a transgenic plant which expresses a
gene or gene segment encoding one or more of the novel polypeptide compositions
disclosed herein. As used herein, the term "transgenic plant" is intended to
refer to a plant that has incorporated DNA sequences, including but not limited
to genes which are perhaps not normally present, DNA sequences not normally
transcribed into RNA or translated into a protein ("expressed"), or any other
genes or DNA sequences which one desires to introduce into the non-transformed
plant, such as genes which may normally be present in the non-transformed plant
but which one desires to either genetically engineer or to have altered
expression.

It is contemplated that in some instances the genome of a transgenic plant of
the present invention will have been augmented through the stable introduction
of one or more Cry3Bb*-encoding transgenes, either native, synthetically
modified, or mutated. In some instances, more than one transgene will be
incorporated into the genome of the transformed host plant cell. Such is the
case when more than one crystal protein-encoding DNA segment is incorporated
into the genome of such a plant. In certain situations, it may be desirable to
have one, two, three, four, or even more B. thuringiensis crystal proteins
(either native or recombinantly-engineered) incorporated and stably expressed
in the transformed transgenic plant.

A preferred gene which may be introduced includes, for example, a crystal
protein-encoding DNA sequence of bacterial origin, and particularly one or more
of those described herein which are obtained from Bacillus spp. Highly
preferred nucleic acid sequences are those obtained from B. thuringiensis, or
any of those sequences which have been genetically engineered to decrease or
increase the insecticidal activity of the crystal protein in such a transformed
host cell.

Means for transforming a plant cell and the preparation of a transgenic cell
line are well-known in the art, and are discussed herein. Vectors, plasmids,
cosmids, YACs (yeast artificial chromosomes) and DNA segments for use in
transforming such cells will, of course, generally comprise either the operons,
genes, or gene-derived sequences of the present invention, either native or
synthetically-derived, and particularly those encoding the disclosed crystal
proteins. These DNA constructs can further include structures such as
promoters, enhancers, polylinkers, or even gene sequences which have
positively- or negatively-regulating activity upon the particular genes of
interest as desired. The DNA segment or gene may encode either a native or
modified crystal protein, which will be expressed in the resultant recombinant
cells, and/or which will impart an improved phenotype to the regenerated plant.

Such transgenic plants may be desirable for increasing the insecticidal
resistance of a monocotyledonous or dicotyledonous plant, by incorporating into
such a plant a transgenic DNA segment encoding a Cry3Bb* crystal protein which
is toxic to coleopteran insects. Particularly preferred plants include grains
such as corn, wheat, rye, rice, barley, and oats; legumes such as soybeans;
tubers such as potatoes; fiber crops such as flax and cotton; turf and pasture
grasses; ornamental plants; shrubs; trees; vegetables, berries, citrus, fruits,
cacti, succulents, and other commercially-important crops including garden and
houseplants.

In a related aspect, the present invention also encompasses a seed produced by
the transformed plant, a progeny from such seed, and a seed produced by the
progeny of the original transgenic plant, produced in accordance with the above
process. Such progeny and seeds will have one or more crystal protein
transgene(s) stably incorporated into their genomes, and such progeny plants
will inherit the traits afforded by the introduction of a stable transgene in
Mendelian fashion. All such transgenic plants having incorporated into their
genome transgenic DNA segments encoding one or more Cry3Bb* crystal proteins or
polypeptides are aspects of this invention. Particularly preferred transgenes
for the practice of the invention include nucleic acid segments comprising one
or more cry3Bb* gene(s).

2.7 Biological Functional Equivalents 

Modifications and changes may be made in the structure of the peptides of the
present invention and DNA segments which encode them and still obtain a
functional molecule that encodes a protein or peptide with desirable
characteristics. The following is a discussion based upon changing the amino
acids of a protein to create an equivalent, or even an improved,
second-generation molecule. In particular embodiments of the invention, mutated
crystal proteins are contemplated to be useful for increasing the insecticidal
activity of the protein, and consequently increasing the insecticidal activity
and/or expression of the recombinant transgene in a plant cell. The amino acid
changes may be achieved by changing the codons of the DNA sequence, according
to the codons given in Table 4.



Table 4

Amino Acids                   Codons

Alanine         Ala   A       GCA  GCC  GCG  GCU
Cysteine        Cys   C       UGC  UGU
Aspartic acid   Asp   D       GAC  GAU
Glutamic acid   Glu   E       GAA  GAG
Phenylalanine   Phe   F       UUC  UUU
Glycine         Gly   G       GGA  GGC  GGG  GGU
Histidine       His   H       CAC  CAU
Isoleucine      Ile   I       AUA  AUC  AUU
Lysine          Lys   K       AAA  AAG
Leucine         Leu   L       UUA  UUG  CUA  CUC  CUG  CUU
Methionine      Met   M       AUG
Asparagine      Asn   N       AAC  AAU
Proline         Pro   P       CCA  CCC  CCG  CCU
Glutamine       Gln   Q       CAA  CAG
Arginine        Arg   R       AGA  AGG  CGA  CGC  CGG  CGU
Serine          Ser   S       AGC  AGU  UCA  UCC  UCG  UCU
Threonine       Thr   T       ACA  ACC  ACG  ACU
Valine          Val   V       GUA  GUC  GUG  GUU
Tryptophan      Trp   W       UGG
Tyrosine        Tyr   Y       UAC  UAU



For example, certain amino acids may be substituted for other amino acids in a
protein structure without appreciable loss of interactive binding capacity with
structures such as, for example, antigen-binding regions of antibodies or
binding sites on substrate molecules. Since it is the interactive capacity and
nature of a protein that defines that protein's biological functional activity,
certain amino acid sequence substitutions can be made in a protein sequence,
and, of course, its underlying DNA coding sequence, and nevertheless obtain a
protein with like properties. It is thus contemplated by the inventors that
various changes may be made in the peptide sequences of the disclosed
compositions, or corresponding DNA sequences which encode said peptides without
appreciable loss of their biological utility or activity.
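
The relationship between a desired amino acid change and the underlying codon
change can be illustrated with a short sketch. It assumes the standard genetic
code that Table 4 tabulates, written with RNA bases; the function name and the
Asp-to-Glu example are hypothetical, and choosing the codon with the fewest
base changes is only one possible criterion, not a method stated in this
specification.

```python
# Standard genetic code (sense codons only), keyed by one-letter amino acid code.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"], "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"], "E": ["GAA", "GAG"], "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"], "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"], "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"], "M": ["AUG"],
    "N": ["AAC", "AAU"], "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"], "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"], "Y": ["UAC", "UAU"],
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def closest_codons(codon, target_aa):
    """Return (codons, n): codons for target_aa needing the fewest base changes, n."""
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))

    options = CODONS[target_aa]
    fewest = min(distance(codon, c) for c in options)
    return [c for c in options if distance(codon, c) == fewest], fewest


# Hypothetical example: change an aspartate codon (GAU) to encode glutamate.
print(CODON_TO_AA["GAU"], closest_codons("GAU", "E"))
```

In this example either glutamate codon is reachable from GAU by a single base
substitution at the third position.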

In making such changes, the hydropathic index of amino acids may be considered.
The importance of the hydropathic amino acid index in conferring interactive
biologic function on a protein is generally understood in the art (Kyte and
Doolittle, 1982, incorporated herein by reference). It is accepted that the
relative hydropathic character of the amino acid contributes to the secondary
structure of the resultant protein, which in turn defines the interaction of
the protein with other molecules, for example, enzymes, substrates, receptors,
DNA, antibodies, antigens, and the like.

Each amino acid has been assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics (Kyte and Doolittle, 1982). These
are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8);
cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4);
threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline
(-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5);
asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is known in the art that certain amino acids may be substituted by other
amino acids having a similar hydropathic index or score and still result in a
protein with similar biological activity, i.e., still obtain a biologically
functionally equivalent protein. In making such changes, the substitution of
amino acids whose hydropathic indices are within ±2 is preferred, those which
are within ±1 are particularly preferred, and those within ±0.5 are even more
particularly preferred.
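
The ±2, ±1, and ±0.5 ranges can be applied mechanically, as in the following
sketch. The index values are the Kyte and Doolittle values recited above; the
function name and the returned labels are hypothetical conveniences, and the
sketch says nothing about whether a given substitution actually preserves
activity.

```python
# Hydropathic indices (Kyte and Doolittle, 1982), keyed by one-letter code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_preference(original, replacement):
    """Classify a substitution by the difference in hydropathic index."""
    # Round to one decimal so floating-point noise cannot shift a boundary case.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "outside the stated ranges"


for pair in [("I", "V"), ("K", "R"), ("L", "M"), ("A", "G")]:
    print(pair, hydropathy_preference(*pair))
```

For instance, isoleucine and valine differ by 0.3 and fall in the narrowest
range, whereas alanine and glycine differ by 2.2 and fall outside all three.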

It is also understood in the art that the substitution of like amino acids can
be made effectively on the basis of hydrophilicity. U. S. Patent 4,554,101,
specifically incorporated herein by reference, states that the greatest local
average hydrophilicity of a protein, as governed by the hydrophilicity of its
adjacent amino acids, correlates with a biological property of the protein.

As detailed in U. S. Patent 4,554,101, the following hydrophilicity values have
been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate
(+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine
(+0.2); glycine (0); threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5);
histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine
(-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan
(-3.4).

It is understood that an amino acid can be substituted for another having a
similar hydrophilicity value and still obtain a biologically equivalent, and in
particular, an immunologically equivalent protein. In such changes, the
substitution of amino acids whose hydrophilicity values are within ±2 is
preferred, those which are within ±1 are particularly preferred, and those
within ±0.5 are even more particularly preferred.
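
The notion of a greatest local average hydrophilicity can be made concrete with
a sliding-window average, sketched below. The values are the central values
recited above (the ± 1 ranges for aspartate, glutamate, and proline are
ignored); the six-residue window, the function name, and the test sequence are
assumptions made for illustration and are not specified in this document.

```python
# Hydrophilicity values as recited above, keyed by one-letter code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def greatest_local_average(protein, window=6):
    """Return (start, segment, average) for the most hydrophilic window.

    `start` is 1-based; `window` is an assumed segment length.
    """
    if len(protein) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_avg = 0, float("-inf")
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        avg = sum(HYDROPHILICITY[res] for res in segment) / window
        if avg > best_avg:  # keep the first window that reaches the maximum
            best_start, best_avg = start, avg
    return best_start + 1, protein[best_start:best_start + window], best_avg


# Hypothetical sequence with a charged stretch flanked by hydrophobic residues.
start, segment, average = greatest_local_average("LLVAIKRDEKSGFW")
print(start, segment, round(average, 2))
```

With this test sequence the maximum falls on the charged stretch beginning at
position 6, as would be expected from the values assigned to lysine, arginine,
aspartate, and glutamate.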

As outlined above, amino acid substitutions are generally therefore based on
the relative similarity of the amino acid side-chain substituents, for example,
their hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary
substitutions which take various of the foregoing characteristics into
consideration are well known to those of skill in the art and include: arginine
and lysine; glutamate and aspartate; serine and threonine; glutamine and
asparagine; and valine, leucine and isoleucine.

3.0 Brief Description of the Drawings 

The drawings form part of the present specification and are included to further
demonstrate certain aspects of the present invention. The invention may be
better understood by reference to one or more of these drawings in combination
with the detailed description of specific embodiments presented herein.

FIG. 1. Schematic representation of the monomeric structure of 
Cry3Bb. 

FIG. 2. Stereoscopic view of the monomeric structure of Cry3Bb with associated
water molecules (represented by dots).

FIG. 3A. Schematic representation of domain 1 of Cry3Bb.

FIG. 3B. Diagram of the positions of the 7 helices that comprise domain 1.

FIG. 4. Domain 1 of Cry3Bb is organized into seven α helices illustrated in
FIG. 3A (schematic representation) and FIG. 3B (schematic diagram). The
α helices and amino acid residues are shown.

FIG. 5A. Schematic representation of domain 2 of Cry3Bb.

FIG. 5B. Diagram of the positions of the 11 β strands that compose the
3 β sheets of domain 2.

FIG. 6. Domain 2 of Cry3Bb is a collection of three anti-parallel β sheets
illustrated in FIG. 5. The amino acids that define these sheets are listed
below (α8, amino acids 322-328, is also included in domain 2):

FIG. 7A. Schematic representation of domain 3 of Cry3Bb.

FIG. 7B. Diagram of the positions of the β strands that comprise domain 3.

FIG. 8. Domain 3 (FIG. 7) is a loosely organized collection of β strands and
loops; no β sheets are present. The β strands contain the amino acids listed
below:

FIG. 9A. A "side" view of the dimeric structure of Cry3Bb. The helical bundles
of domains 1 can be seen in the middle of the molecule.

FIG. 9B. A "top" view of the dimeric structure of Cry3Bb. The helical bundles
of domains 1 can be seen in the middle of the molecule.

FIG. 10. A graphic representation of the growth in conductance with time of
channels formed by Cry3A and Cry3Bb in planar lipid bilayers. Cry3A forms
channels with higher conductances much more rapidly than Cry3Bb.

FIG. 11. A map of pEG1701, which contains the Cry3Bb gene with the cry1F
terminator.

FIG. 12. The results of replicated, 1-dose assays against SCRW larvae of Cry3Bb
proteins altered in the lβ2,3 region.

FIG. 13. The results of replicated, 1-dose assays against SCRW larvae of Cry3Bb
proteins altered in the lβ6,7 region.

FIG. 14. The results of replicated, 1-dose screens against SCRW larvae of
Cry3Bb proteins altered in the lβ10,11 region.

FIG. 15. Single channel recordings of channels formed by Cry3Bb.11230 and WT
Cry3Bb in planar lipid bilayers. Cry3Bb.11230 forms channels with well resolved
open and closed states while Cry3Bb rarely does.

FIG. 16. Single channel recordings of channels formed by Cry3Bb and Cry3Bb.60,
a truncated form of Cry3Bb. Cry3Bb.60 forms channels more quickly than Cry3Bb
and, unlike Cry3Bb, produces channels with well resolved open and closed
states.

FIG. 17A. Sequence alignment of the amino acid sequence of Cry3A, Cry3B, and
Cry3C.

FIG. 17B. Shown is a continuation of alignment of the amino acid sequence of
Cry3A, Cry3B, and Cry3C shown in FIG. 17A.

FIG. 17C. Shown is a continuation of alignment of the amino acid sequence of
Cry3A, Cry3B, and Cry3C shown in FIG. 17A.

4.0 Description of Illustrative Embodiments 

The invention defines new B. thuringiensis (Bt) insecticidal δ-endotoxin
proteins and the biochemical and biophysical strategies used to design the new
proteins. Delta-endotoxins are a class of insecticidal proteins produced by
B. thuringiensis that form cation-selective channels in planar lipid bilayers
(English and Slatin, 1992). The new δ-endotoxins are based on the parent
structure of the coleopteran-active δ-endotoxin Cry3Bb. Like other members of
the coleopteran-active class of δ-endotoxins, including Cry3A and Cry3B, Cry3Bb
exhibits excellent insecticidal activity against the Colorado Potato Beetle
(Leptinotarsa decemlineata). However, unlike Cry3A and Cry3B, Cry3Bb is also
active against the southern corn rootworm or SCRW (Diabrotica undecimpunctata
howardi Barber) and the western corn rootworm or WCRW (Diabrotica virgifera
virgifera LeConte). The new insecticidal proteins described herein were
specifically designed to improve the biological activity of the parent Cry3Bb
protein. In addition, the design strategies themselves are novel inventions
capable of being applied to and improving B. thuringiensis δ-endotoxins in
general. B. thuringiensis δ-endotoxins are also members of a larger class of
bacterial toxins that form ion channels (see English and Slatin, 1992, for a
review). The inventors, therefore, believe that these design strategies can
also be applied to any biologically active, channel-forming protein to improve
its biological properties.

The designed Cry3Bb proteins were engineered using one or more of the following
strategies: (1) identification and alteration of protease-sensitive sites and
proteolytic processing; (2) analysis and manipulation of bound water; (3)
manipulation of hydrogen bonds around mobile regions; (4) loop analysis and
loop redesign around flexible helices; (5) loop design around β strands and
β sheets; (6) identification and redesign of complex electrostatic surfaces;
(7) identification and removal of metal binding sites; (8) alteration of
quaternary structure; (9) identification and design of structural residues; and
(10) combinations of any and all sites defined by strategies 1-9. These design
strategies permit the identification and redesign of specific sites on Cry3Bb,
ultimately creating new proteins with improved insecticidal activities. These
new proteins are designated Cry3Bb designed proteins and are named Cry3Bb
followed by a period and a suffix (e.g., Cry3Bb.60, Cry3Bb.11231). The new
proteins are listed in Table 2 along with the specific sites on the molecule
that were modified, the amino-acid sequence changes at those sites that improve
biological activity, the improved insecticidal activities and the design method
used to identify that specific site.

4.1 Some Advantages of the Invention 

Mutagenesis studies with cry genes have failed to identify a significant number
of mutant crystal proteins which have improved broad-spectrum insecticidal
activity, that is, with improved toxicity towards a range of insect pest
species. Since agricultural crops are typically threatened by more than one
insect pest species at any given time, desirable mutant crystal proteins are
preferably those that exhibit improvements in toxicity towards multiple insect
pest species. Previous failures to identify such mutants may be attributed to
the choice of sites targeted for mutagenesis. For example, with respect to the
related protein, Cry1C, sites within domain 2 and domain 3 have been the
principal targets of mutagenesis efforts, primarily because these domains are
believed to be important for receptor binding and in determining insecticidal
specificity (Aronson et al., 1995; Chen et al., 1993; de Maagd et al., 1996;
Lee et al., 1992; Lee et al., 1995; Lu et al., 1994; Smedley and Ellar, 1996;
Smith and Ellar, 1994; Rajamohan et al., 1995; Rajamohan et al., 1996).

In contrast, the present inventors reasoned that the toxicity of Cry3 proteins,
and specifically the toxicity of the Cry3Bb protein, may be improved against a
broader array of target pests by targeting regions involved in ion channel
function rather than regions of the molecule directly involved in receptor
interactions, namely domains 2 and 3. Accordingly, the inventors opted to
target regions within domain 1 of Cry3Bb for mutagenesis for the purpose of
isolating Cry3Bb mutants with improved broad spectrum toxicity. Indeed, in the
present invention, Cry3Bb mutants are described that show improved toxicity
towards several coleopteran pests.

At least one, and probably more than one, α helix of domain 1 is involved in
the formation of ion channels and pores within the insect midgut epithelium
(Gazit and Shai, 1993; Gazit and Shai, 1995). Rather than target for
mutagenesis the sequences encoding the α helices of domain 1 as others have (Wu
and Aronson, 1992; Aronson et al., 1995; Chen et al., 1995), the present
inventors opted to target exclusively sequences encoding amino acid residues
adjacent to or lying within the predicted loop regions of Cry3Bb that separate
these α helices. Amino acid residues within these loop regions or amino acid
residues capping the end of an α helix and lying adjacent to these loop regions
may affect the spatial relationships among these α helices. Consequently, the
substitution of these amino acid residues may result in subtle changes in
tertiary structure, or even quaternary structure, that positively impact the
function of the ion channel. Amino acid residues in the loop regions of domain
1 are exposed to the solvent and thus are available for various molecular
interactions. Altering these amino acids could result in greater stability of
the protein by eliminating or occluding protease-sensitive sites. Amino acid
substitutions that change the surface charge of domain 1 could alter ion
channel efficiency or alter interactions with the brush border membrane or with
other portions of the toxin molecule, allowing binding or insertion to be more
effective.

According to this invention, base substitutions are made in the underlying
cry3Bb nucleic acid residues in order to change particular codons of the
corresponding polypeptides, and particularly, in those loop regions between
α-helices. The insecticidal activity of a crystal protein ultimately dictates
the level of crystal protein required for effective insect control. The potency
of an insecticidal protein should be maximized as much as possible in order to
provide for its economic and efficient utilization in the field. The increased
potency of an insecticidal protein in a bioinsecticide formulation would be
expected to improve the field performance of the bioinsecticide product.
Alternatively, increased potency of an insecticidal protein in a bioinsecticide
formulation may promote use of reduced amounts of bioinsecticide per unit area
of treated crop, thereby allowing for more cost-effective use of the
bioinsecticide product. When expressed in planta, the production of crystal
proteins with improved insecticidal activity can be expected to improve plant
resistance to susceptible insect pests.

4.2 Methods for Culturing B. thuringiensis to Produce Crystal 
Proteins 

The B. thuringiensis strains described herein may be cultured using standard
known media and fermentation techniques. Upon completion of the fermentation
cycle, the bacteria may be harvested by first separating the B. thuringiensis
spores and crystals from the fermentation broth by means well known in the art.
The recovered B. thuringiensis spores and crystals can be formulated into a
wettable powder, a liquid concentrate, granules or other formulations by the
addition of surfactants, dispersants, inert carriers and other components to
facilitate handling and application for particular target pests. The
formulation and application procedures are all well known in the art.

4.3 Recombinant Host Cells For Expression of cry* Genes

The nucleotide sequences of the subject invention can be introduced into a wide
variety of microbial hosts. Expression of the toxin gene results, directly or
indirectly, in the intracellular production and maintenance of the pesticide.
With suitable hosts, e.g., Pseudomonas, the microbes can be applied to the
sites of coleopteran insects where they will proliferate and be ingested by the
insects. The result is a control of the unwanted insects. Alternatively, the
microbe hosting the toxin gene can be treated under conditions that prolong the
activity of the toxin produced in the cell. The treated cell then can be
applied to the environment of target pest(s). The resulting product retains the
toxicity of the B. thuringiensis toxin.

Suitable host cells, where the pesticide-containing cells will be treated to
prolong the activity of the toxin in the cell when the treated cell is applied
to the environment of target pest(s), may include either prokaryotes or
eukaryotes, normally being limited to those cells which do not produce
substances toxic to higher organisms, such as mammals. However, organisms which
produce substances toxic to higher organisms could be used, where the toxin is
unstable or the level of application sufficiently low as to avoid any
possibility of toxicity to a mammalian host. As hosts, of particular interest
will be the prokaryotes and the lower eukaryotes, such as fungi. Illustrative
prokaryotes, both Gram-negative and Gram-positive, include Enterobacteriaceae,
such as Escherichia, Erwinia, Shigella, Salmonella, and Proteus; Bacillaceae;
Rhizobiaceae, such as Rhizobium; Spirillaceae, such as Photobacterium,
Zymomonas, Serratia, Aeromonas, Vibrio, Desulfovibrio, Spirillum;
Lactobacillaceae; Pseudomonadaceae, such as Pseudomonas and Acetobacter;
Azotobacteraceae, Actinomycetales, and Nitrobacteraceae. Among eukaryotes are
fungi, such as Phycomycetes and Ascomycetes, which includes yeast, such as
Saccharomyces and Schizosaccharomyces; and Basidiomycetes yeast, such as
Rhodotorula, Aureobasidium, Sporobolomyces, and the like.

Characteristics of particular interest in selecting a host cell for purposes of
production include ease of introducing the B. thuringiensis gene into the host,
availability of expression systems, efficiency of expression, stability of the
pesticide in the host, and the presence of auxiliary genetic capabilities.
Characteristics of interest for use as a pesticide microcapsule include
protective qualities for the pesticide, such as thick cell walls, pigmentation,
and intracellular packaging or formation of inclusion bodies; leaf affinity;
lack of mammalian toxicity; attractiveness to pests for ingestion; ease of
killing and fixing without damage to the toxin; and the like. Other
considerations include ease of formulation and handling, economics, storage
stability, and the like.

Host organisms of particular interest include yeast, such as Rhodotorula sp.,
Aureobasidium sp., Saccharomyces sp., and Sporobolomyces sp.; phylloplane
organisms such as Pseudomonas sp., Erwinia sp. and Flavobacterium sp.; or such
other organisms as Escherichia, Lactobacillus sp., Bacillus sp., Streptomyces
sp., and the like. Specific organisms include Pseudomonas aeruginosa,
Pseudomonas fluorescens, Saccharomyces cerevisiae, B. thuringiensis,
Escherichia coli, B. subtilis, B. megaterium, B. cereus, Streptomyces lividans
and the like.

Treatment of the microbial cell, e.g., a microbe containing the
B. thuringiensis toxin gene, can be by chemical or physical means, or by a
combination of chemical and/or physical means, so long as the technique does
not deleteriously affect the properties of the toxin, nor diminish the cellular
capability in protecting the toxin. Examples of chemical reagents are
halogenating agents, particularly halogens of atomic no. 17-80. More
particularly, iodine can be used under mild conditions and for sufficient time
to achieve the desired results. Other suitable techniques include treatment
with aldehydes, such as formaldehyde and glutaraldehyde; anti-infectives, such
as zephiran chloride and cetylpyridinium chloride; alcohols, such as isopropyl
and ethanol; various histologic fixatives, such as Lugol's iodine, Bouin's
fixative, and Helly's fixative (see, e.g., Humason, 1967); or a combination of
physical (heat) and chemical agents that preserve and prolong the activity of
the toxin produced in the cell when the cell is administered to the host
animal. Examples of physical means are short wavelength radiation such as
γ-radiation and X-radiation, freezing, UV irradiation, lyophilization, and the
like. The cells employed will usually be intact and be substantially in the
proliferative form when treated, rather than in a spore form, although in some
instances spores may be employed.

Where the B. thuringiensis toxin gene is introduced via a suitable vector 
into a microbial host, and said host is applied to the environment in a living state, it 
is essential that certain host microbes be used. Microorganism hosts are selected
which are known to occupy the "phytosphere" (phylloplane, phyllosphere,
rhizosphere, and/or rhizoplane) of one or more crops of interest. These
microorganisms are selected so as to be capable of successfully competing in the particular
environment (crop and other insect habitats) with the wild-type microorganisms,
provide for stable maintenance and expression of the gene expressing the
polypeptide pesticide, and, desirably, provide for improved protection of the pesticide from
environmental degradation and inactivation.

A large number of microorganisms are known to inhabit the phylloplane
(the surface of the plant leaves) and/or the rhizosphere (the soil surrounding plant
roots) of a wide variety of important crops. These microorganisms include
bacteria, algae, and fungi. Of particular interest are microorganisms, such as bacteria,
e.g., genera Bacillus (including the species and subspecies B. thuringiensis kurstaki
HD-1, B. thuringiensis kurstaki HD-73, B. thuringiensis sotto, B. thuringiensis
berliner, B. thuringiensis thuringiensis, B. thuringiensis tolworthi, B. thuringiensis
dendrolimus, B. thuringiensis alesti, B. thuringiensis galleriae, B. thuringiensis
aizawai, B. thuringiensis subtoxicus, B. thuringiensis entomocidus, B. thuringiensis
tenebrionis and B. thuringiensis san diego); Pseudomonas, Erwinia, Serratia,
Klebsiella, Xanthomonas, Streptomyces, Rhizobium, Rhodopseudomonas,
Methylophilus, Agrobacterium, Acetobacter, Lactobacillus, Arthrobacter, Azotobacter,
Leuconostoc, and Alcaligenes; fungi, particularly yeast, e.g., genera
Saccharomyces, Cryptococcus, Kluyveromyces, Sporobolomyces, Rhodotorula, and
Aureobasidium. Of particular interest are such phytosphere bacterial species as
Pseudomonas syringae, Pseudomonas fluorescens, Serratia marcescens, Acetobacter
xylinum, Agrobacterium tumefaciens, Rhodobacter sphaeroides, Xanthomonas
campestris, Rhizobium meliloti, Alcaligenes eutrophus, and Azotobacter vinelandii;
and phytosphere yeast species such as Rhodotorula rubra, R. glutinis, R. marina,
R. aurantiaca, Cryptococcus albidus, C. diffluens, C. laurentii, Saccharomyces
rosei, S. pretoriensis, S. cerevisiae, Sporobolomyces roseus, S. odorus,
Kluyveromyces veronae, and Aureobasidium pullulans.

4.4 Definitions

In accordance with the present invention, nucleic acid sequences include 
and are not limited to DNA (including and not limited to genomic or extragenomic 
DNA), genes, RNA (including and not limited to mRNA and tRNA), nucleosides, 
and suitable nucleic acid segments either obtained from native sources, chemically 
synthesized, modified, or otherwise prepared by the hand of man. The following
words and phrases have the meanings set forth below. 

A, an: In accordance with long standing patent law convention, the words
"a" and "an" when used in this application, including the claims, denote "one or
more".

Broad-spectrum: Refers to a wide range of insect species.

Broad-spectrum activity: The toxicity towards a wide range of insect 
species. 

Expression: The combination of intracellular processes, including
transcription and translation, undergone by a coding DNA molecule such as a structural
gene to produce a polypeptide.

Insecticidal activity: The toxicity towards insects. 

Insecticidal specificity: The toxicity exhibited by a crystal protein or pro- 
teins, microbe or plant, towards multiple insect species. 

Intraorder specificity: The toxicity of a particular crystal protein towards 
insect species within an Order of insects (e.g., Order Coleoptera).

Interorder specificity: The toxicity of a particular crystal protein towards 
insect species of different Orders (e.g., Orders Coleoptera and Diptera).

LC50: The lethal concentration of crystal protein that causes 50% mortality
of the insects treated.

LC95: The lethal concentration of crystal protein that causes 95% mortality
of the insects treated.

Promoter: A recognition site on a DNA sequence or group of DNA
sequences that provides an expression control element for a structural gene and to
which RNA polymerase specifically binds and initiates RNA synthesis
(transcription) of that gene.

Regeneration: The process of growing a plant from a plant cell (e.g., plant
protoplast or explant).

Structural gene: A gene that is expressed to produce a polypeptide. 

Transformation: A process of introducing an exogenous DNA sequence 
(e.g., a vector, a recombinant DNA molecule) into a cell or protoplast in which that 
exogenous DNA is incorporated into a chromosome or is capable of autonomous
replication. 

Transformed cell: A cell whose DNA has been altered by the introduction 
of an exogenous DNA molecule into that cell. 

Transgenic cell: Any cell derived or regenerated from a transformed cell 
or derived from a transgenic cell. Exemplary transgenic cells include plant calli
derived from a transformed plant cell and particular cells such as leaf, root, stem, 
e.g., somatic cells, or reproductive (germ) cells obtained from a transgenic plant. 

Transgenic plant: A plant or progeny thereof derived from a transformed 
plant cell or protoplast, wherein the plant DNA contains an introduced exogenous 
DNA molecule not originally present in a native, non-transgenic plant of the same
strain. The terms "transgenic plant" and "transformed plant" have sometimes been 
used in the art as synonymous terms to define a plant whose DNA contains an ex- 
ogenous DNA molecule. However, it is thought more scientifically correct to refer 
to a regenerated plant or callus obtained from a transformed plant cell or protoplast 
as being a transgenic plant, and that usage will be followed herein.

Vector: A DNA molecule capable of replication in a host cell and/or to 
which another DNA segment can be operatively linked so as to bring about repli- 
cation of the attached segment. A plasmid is an exemplary vector. 

As used herein, the designations "CryIII" and "Cry3" are synonymous, as
are the designations "CryIIIB2" and "Cry3Bb." Likewise, the inventors have
utilized the generic term Cry3Bb* to denote any and all Cry3Bb variants which
comprise amino acid sequences modified in the protein. Similarly, cry3Bb* is meant to
denote any and all nucleic acid segments and/or genes which encode a Cry3Bb*
protein, etc.
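
By way of illustration only, the LC50 and LC95 values defined above can be
estimated from dose-mortality data roughly as sketched below. The sketch
interpolates linearly on log10(concentration) between the two tested doses that
bracket the target mortality. This is a simplifying assumption and is not the
analysis used in this document; bioassay data are commonly analyzed by probit
regression instead. The function name and the data are hypothetical.

```python
import math


def lethal_concentration(concentrations, mortality, target=0.5):
    """Estimate the concentration that gives `target` fractional mortality.

    Interpolates on log10(concentration) between the two tested doses whose
    observed mortalities bracket the target. Illustrative simplification only.
    """
    pairs = sorted(zip(concentrations, mortality))
    for (c_lo, m_lo), (c_hi, m_hi) in zip(pairs, pairs[1:]):
        if m_lo <= target <= m_hi and m_hi > m_lo:
            frac = (target - m_lo) / (m_hi - m_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("target mortality is not bracketed by the data")


# Hypothetical bioassay data (arbitrary concentration units, not from this document).
doses = [1.0, 10.0, 100.0, 1000.0]
killed = [0.05, 0.30, 0.70, 0.95]  # fraction of treated insects that died

lc50 = lethal_concentration(doses, killed, target=0.50)
lc95 = lethal_concentration(doses, killed, target=0.95)
print(f"LC50 ~ {lc50:.1f}, LC95 ~ {lc95:.1f}")
```

A lower LC50 for a modified protein than for the unmodified protein, measured
against the same insect, would indicate increased toxicity under this definition.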

4.5 Preparation of cry3* Polynucleotides

Once the structure of the desired peptide to be mutagenized has been ana- 
lyzed using one or more of the design strategies disclosed herein, it will be desir- 
able to introduce one or more mutations into either the protein or, alternatively, 
into the DNA sequence encoding the protein for the purpose of producing a
mutated protein with altered bioinsecticidal properties.

To that end, the present invention encompasses both site-specific 
mutagenesis methods and random mutagenesis of a nucleic acid segment encoding 
a crystal protein in the manner described herein. In particular, methods are
disclosed for the mutagenesis of nucleic acid segments encoding the amino acid
sequences using one or more of the design strategies described herein. Using the
assay methods described herein, one may then identify mutants arising from these
procedures which have improved insecticidal properties or altered specificity,
either intraorder or interorder.

The means for mutagenizing a DNA segment encoding a crystal protein are
well known to those of skill in the art. Modifications may be made by random or
site-specific mutagenesis procedures. The nucleic acid may be modified by
altering its structure through the addition or deletion of one or more nucleotides from
the sequence.

Mutagenesis may be performed in accordance with any of the techniques
known in the art, such as and not limited to synthesizing an oligonucleotide having
one or more mutations within the sequence of a particular crystal protein. A
"suitable host" is any host which will express Cry3Bb, such as and not limited to
B. thuringiensis and E. coli. Screening for insecticidal activity, in the case of
Cry3Bb, includes and is not limited to coleopteran-toxic activity, which may be
screened for by techniques known in the art.

In particular, site-specific mutagenesis is a technique useful in the
preparation of individual peptides, or biologically functional equivalent proteins or
peptides, through specific mutagenesis of the underlying DNA. The technique further
provides a ready ability to prepare and test sequence variants, for example,
incorporating one or more of the foregoing considerations, by introducing one or more
nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the
production of mutants through the use of specific oligonucleotide sequences which
encode the DNA sequence of the desired mutation, as well as a sufficient number
of adjacent nucleotides, to provide a primer sequence of sufficient size and
sequence complexity to form a stable duplex on both sides of the deletion junction
being traversed. Typically, a primer of about 17 to about 75 nucleotides or more in
length is preferred, with about 10 to about 25 or more residues on both sides of the
junction of the sequence being altered.
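
The primer geometry described above (an altered site flanked on both sides by
unchanged template sequence) may be sketched as follows. This is a minimal
illustration under stated assumptions: the function name, the template sequence,
and the chosen codon change are all hypothetical and are not taken from any
cry3Bb sequence, and no check of duplex stability is attempted.

```python
def design_mutagenic_primer(template, start, end, replacement, flank=12):
    """Return a primer that carries `replacement` in place of template[start:end],
    with `flank` unchanged nucleotides on each side of the altered site.

    Hypothetical helper for illustration; it only assembles the sequence.
    """
    if start < flank or end + flank > len(template):
        raise ValueError("not enough template sequence to flank the altered site")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + replacement + right


# Hypothetical 42-nt coding sequence (14 codons), written 5'->3'.
template = "ATGGCTAGCAAACTGGATCCGTTTGAAGCTCATGGTACCTGA"

codon_index = 6                       # zero-based codon to alter (CCG)
start, end = 3 * codon_index, 3 * codon_index + 3
primer = design_mutagenic_primer(template, start, end, "GCG", flank=12)
print(primer, len(primer))
```

With a 12-nucleotide flank on each side of a 3-nucleotide change, the primer is
27 nucleotides long, which falls within the length range stated above.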

In general, the technique of site-specific mutagenesis is well known in the
art, as exemplified by various publications. As will be appreciated, the technique
typically employs a phage vector which exists in both a single stranded and double
stranded form. Typical vectors useful in site-directed mutagenesis include vectors
such as the M13 phage. These phage are readily commercially available and their
use is generally well known to those skilled in the art. Double stranded plasmids
are also routinely employed in site directed mutagenesis, which eliminates the step
of transferring the gene of interest from a plasmid to a phage.

In general, site-directed mutagenesis in accordance herewith is performed
by first obtaining a single-stranded vector or melting apart the two strands of a
double stranded vector which includes within its sequence a DNA sequence which
encodes the desired peptide. An oligonucleotide primer bearing the desired
mutated sequence is prepared, generally synthetically. This primer is then annealed
with the single-stranded vector, and subjected to DNA polymerizing enzymes such
as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the
mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand
encodes the original non-mutated sequence and the second strand bears the desired
mutation. This heteroduplex vector is then used to transform or transfect
appropriate cells, such as E. coli cells, and clones are selected which include
recombinant vectors bearing the mutated sequence arrangement. A genetic selection
scheme was devised by Kunkel et al. (1987) to enrich for clones incorporating the
mutagenic oligonucleotide. Alternatively, PCR™ with commercially
available thermostable enzymes such as Taq polymerase may be used to
incorporate a mutagenic oligonucleotide primer into an amplified DNA fragment that can
then be cloned into an appropriate cloning or expression vector. The
PCR™-mediated mutagenesis procedures of Tomic et al. (1990) and Upender et al. (1995)
provide two examples of such protocols. A PCR™ employing a thermostable
ligase in addition to a thermostable polymerase may also be used to incorporate a
phosphorylated mutagenic oligonucleotide into an amplified DNA fragment that
may then be cloned into an appropriate cloning or expression vector. The
mutagenesis procedure described by Michael (1994) provides an example of one
such protocol.

The preparation of sequence variants of the selected peptide-encoding DNA 
segments using site-directed mutagenesis is provided as a means of producing po- 
tentially useful species and is not meant to be limiting as there are other ways in 
which sequence variants of peptides and the DNA sequences encoding them may 
be obtained. For example, recombinant vectors encoding the desired peptide se- 
quence may be treated with mutagenic agents, such as hydroxylamine, to obtain 
sequence variants. 

As used herein, the term "oligonucleotide directed mutagenesis procedure"
refers to template-dependent processes and vector-mediated propagation which
result in an increase in the concentration of a specific nucleic acid molecule relative
to its initial concentration, or in an increase in the concentration of a detectable
signal, such as amplification. As used herein, the term "oligonucleotide directed
mutagenesis procedure" is intended to refer to a process that involves the
template-dependent extension of a primer molecule. The term "template-dependent
process" refers to nucleic acid synthesis of an RNA or a DNA molecule wherein the
sequence of the newly synthesized strand of nucleic acid is dictated by the
well-known rules of complementary base pairing (see, for example, Watson, 1987).
Typically, vector mediated methodologies involve the introduction of the nucleic
acid fragment into a DNA or RNA vector, the clonal amplification of the vector,
and the recovery of the amplified nucleic acid fragment. Examples of such
methodologies are provided by U. S. Patent 4,237,224, specifically incorporated herein
by reference in its entirety.
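
The sense in which a newly synthesized strand is "dictated" by complementary
base pairing can be shown with a short sketch. It assumes an unambiguous DNA
template written 5'->3' and the standard A:T and G:C pairings; the function name
and the example sequence are hypothetical.

```python
# Standard Watson-Crick pairings for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize_from_template(template_5to3):
    """Return the strand templated by `template_5to3`, itself written 5'->3'.

    Each base pairs with its complement, and the two strands are antiparallel,
    so the new strand is the reverse complement of the template.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3))


new_strand = synthesize_from_template("ATGGCTAGC")
print(new_strand)
```

Copying the new strand in turn regenerates the original template sequence, which
is why repeated rounds of template-dependent synthesis preserve the sequence.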

A number of template dependent processes are available to amplify the
target sequences of interest present in a sample. One of the best known amplification
methods is the polymerase chain reaction (PCR™) which is described in detail in
U. S. Patents 4,683,195, 4,683,202 and 4,800,159 (each of which is specifically
incorporated herein by reference in its entirety). Briefly, in PCR™, two primer
sequences are prepared which are complementary to regions on opposite
complementary strands of the target sequence. An excess of deoxynucleoside triphosphates
is added to a reaction mixture along with a DNA polymerase (e.g., Taq
polymerase). If the target sequence is present in a sample, the primers will bind to
the target and the polymerase will cause the primers to be extended along the target
sequence by adding on nucleotides. By raising and lowering the temperature of the
reaction mixture, the extended primers will dissociate from the target to form
reaction products, excess primers will bind to the target and to the reaction products
and the process is repeated. Preferably a reverse transcriptase PCR™ amplification
procedure may be performed in order to quantify the amount of mRNA amplified.
Polymerase chain reaction methodologies are well known in the art.
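
Because the reaction products of each cycle serve as targets in the next, the
number of target copies grows geometrically with cycle number. The sketch below
uses the idealized model N = N0 x (1 + E)^n, where E is a per-cycle efficiency
between 0 and 1. The model, the function name, and the numbers are illustrative
assumptions and are not taken from the patents cited above; real reactions
plateau as primers and nucleotides are consumed.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target copies after `cycles` rounds of PCR.

    efficiency = 1.0 means every target molecule is copied in every cycle
    (perfect doubling); lower values model incomplete primer extension.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


perfect = pcr_copies(1, 30)                    # perfect doubling for 30 cycles
partial = pcr_copies(1, 30, efficiency=0.9)    # hypothetical 90% efficiency
print(f"{perfect:.3e} copies at E=1.0, {partial:.3e} copies at E=0.9")
```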

Another method for amplification is the ligase chain reaction (referred to as
LCR), disclosed in Eur. Pat. Appl. Publ. No. 320,308, incorporated herein by
reference in its entirety. In LCR, two complementary probe pairs are prepared, and in
the presence of the target sequence, each pair will bind to opposite complementary
strands of the target such that they abut. In the presence of a ligase, the two probe
pairs will link to form a single unit. By temperature cycling, as in PCR™, bound
ligated units dissociate from the target and then serve as "target sequences" for
ligation of excess probe pairs. U. S. Patent 4,883,750, specifically incorporated
herein by reference in its entirety, describes an alternative method of amplification
similar to LCR for binding probe pairs to a target sequence.

Qbeta Replicase™, described in Intl. Pat. Appl. Publ. No.
PCT/US87/00880, incorporated herein by reference in its entirety, may also be
used as still another amplification method in the present invention. In this method,
a replicative sequence of RNA which has a region complementary to that of a
target is added to a sample in the presence of an RNA polymerase. The polymerase
will copy the replicative sequence which can then be detected.

An isothermal amplification method, in which restriction endonucleases
and ligases are used to achieve the amplification of target molecules that contain
nucleotide 5'-[α-thio]triphosphates in one strand of a restriction site (Walker et al.,
1992, incorporated herein by reference in its entirety), may also be useful in the
amplification of nucleic acids in the present invention.

Strand Displacement Amplification (SDA) is another method of carrying
out isothermal amplification of nucleic acids which involves multiple rounds of
strand displacement and synthesis, i.e., nick translation. A similar method, called
Repair Chain Reaction (RCR), is another method of amplification which may be
useful in the present invention and involves annealing several probes throughout
a region targeted for amplification, followed by a repair reaction in which only two
of the four bases are present. The other two bases can be added as biotinylated
derivatives for easy detection. A similar approach is used in SDA.

Sequences can also be detected using a cyclic probe reaction (CPR). In
CPR, a probe having 3' and 5' end sequences of non-Cry-specific DNA and an
internal sequence of a Cry-specific RNA is hybridized to DNA which is present in a
sample. Upon hybridization, the reaction is treated with RNaseH, and the products
of the probe are identified as distinctive products which are released after
digestion and generate a signal. The original template is annealed to another cycling probe
and the reaction is repeated. Thus, CPR involves amplifying a signal generated by
hybridization of a probe to a cry-specific expressed nucleic acid.

Still other amplification methods described in Great Britain Pat. Appl. No.
2 202 328, and in Intl. Pat. Appl. Publ. No. PCT/US89/01025, each of which is
incorporated herein by reference in its entirety, may be used in accordance with the
present invention. In the former application, "modified" primers are used in a
PCR™-like, template and enzyme dependent synthesis. The primers may be
modified by labeling with a capture moiety (e.g., biotin) and/or a detector moiety
(e.g., enzyme). In the latter application, an excess of labeled probes is added to a
sample. In the presence of the target sequence, the probe binds and is cleaved
catalytically. After cleavage, the target sequence is released intact to be bound by
excess probe. Cleavage of the labeled probe signals the presence of the target
sequence.

Other nucleic acid amplification procedures include transcription-based
amplification systems (TAS) (Kwoh et al., 1989; Intl. Pat. Appl. Publ. No. WO
88/10315, incorporated herein by reference in its entirety), including nucleic acid
sequence based amplification (NASBA) and 3SR. In NASBA, the nucleic acids
can be prepared for amplification by standard phenol/chloroform extraction, heat
denaturation of a sample, treatment with lysis buffer and minispin columns for
isolation of DNA and RNA or guanidinium chloride extraction of RNA. These
amplification techniques involve annealing a primer which has crystal
protein-specific sequences. Following polymerization, DNA/RNA hybrids are digested
with RNase H while double stranded DNA molecules are heat denatured again. In
either case the single stranded DNA is made fully double stranded by addition of
a second crystal protein-specific primer, followed by polymerization. The double
stranded DNA molecules are then multiply transcribed by a polymerase such as T7
or SP6. In an isothermal cyclic reaction, the RNAs are reverse transcribed into
double stranded DNA, and transcribed once again with a polymerase such as T7
or SP6. The resulting products, whether truncated or complete, indicate crystal
protein-specific sequences.

Eur. Pat. Appl. Publ. No. 329,822, incorporated herein by reference in its
entirety, discloses a nucleic acid amplification process involving cyclically
synthesizing single-stranded RNA ("ssRNA"), ssDNA, and double-stranded DNA
(dsDNA), which may be used in accordance with the present invention. The
ssRNA is a first template for a first primer oligonucleotide, which is elongated by
reverse transcriptase (RNA-dependent DNA polymerase). The RNA is then
removed from the resulting DNA:RNA duplex by the action of ribonuclease H (RNase
H, an RNase specific for RNA in a duplex with either DNA or RNA). The
resultant ssDNA is a second template for a second primer, which also includes the
sequences of an RNA polymerase promoter (exemplified by T7 RNA polymerase) 5'
to its homology to its template. This primer is then extended by DNA polymerase
(exemplified by the large "Klenow" fragment of E. coli DNA polymerase I),
resulting in a double-stranded DNA ("dsDNA") molecule, having a sequence identical to
that of the original RNA between the primers and having additionally, at one end, a
promoter sequence. This promoter sequence can be used by the appropriate RNA
polymerase to make many RNA copies of the DNA. These copies can then
re-enter the cycle, leading to very swift amplification. With proper choice of
enzymes, this amplification can be done isothermally without addition of enzymes at
each cycle. Because of the cyclical nature of this process, the starting sequence can
be chosen to be in the form of either DNA or RNA.

Intl. Pat. Appl. Publ. No. WO 89/06700, incorporated herein by reference in
its entirety, discloses a nucleic acid sequence amplification scheme based on the
hybridization of a promoter/primer sequence to a target single-stranded DNA
("ssDNA") followed by transcription of many RNA copies of the sequence. This
scheme is not cyclic; new templates are not produced from the resultant RNA
transcripts. Other amplification methods include "RACE" (Frohman, 1990), and
"one-sided PCR™" (Ohara, 1989) which are well-known to those of skill in the art.

Methods based on ligation of two (or more) oligonucleotides in the
presence of nucleic acid having the sequence of the resulting "di-oligonucleotide",
thereby amplifying the di-oligonucleotide (Wu and Dean, 1996, incorporated
herein by reference in its entirety), may also be used in the amplification of DNA
sequences of the present invention.

4.6 Phage-Resistant Variants 

In certain embodiments, one may desire to prepare one or more
phage-resistant variants of the B. thuringiensis mutants prepared by the methods described
herein. To do so, an aliquot of a phage lysate is spread onto nutrient agar and
allowed to dry. An aliquot of the phage-sensitive bacterial strain is then plated
directly over the dried lysate and allowed to dry. The plates are incubated at 30°C
for 2 days, at which time numerous colonies can be
seen growing on the agar. Some of these colonies are picked and subcultured onto
nutrient agar plates. These apparently resistant cultures are tested for resistance by
cross streaking with the phage lysate. A line of the phage lysate is streaked on the
plate and allowed to dry. The presumptive resistant cultures are then streaked
across the phage line. Resistant bacterial cultures show no lysis anywhere in the
streak across the phage line after overnight incubation at 30°C. The resistance to
phage is then reconfirmed by plating a lawn of the resistant culture onto a nutrient
agar plate. The sensitive strain is also plated in the same manner to serve as the
positive control. After drying, a drop of the phage lysate is plated in the center of
the plate and allowed to dry. Resistant cultures show no lysis in the area where
the phage lysate has been placed after incubation at 30°C for 24 hours.

4.7 Crystal Protein Compositions As Insecticides and Methods of Use

Order Coleoptera comprises numerous beetle species including ground
beetles, reticulated beetles, skin and larder beetles, long-horned beetles, leaf
beetles, weevils, bark beetles, ladybird beetles, soldier beetles, stag beetles, water
scavenger beetles, and a host of other beetles. A brief taxonomy of the Order is
given at the website http://www.ncbi.nlm.nih.gov/Taxonomy/tax.html.

Particularly important among the Coleoptera are the agricultural pests
included within the infraorders Chrysomeliformia and Cucujiformia. Members of
the infraorder Chrysomeliformia, including the leaf beetles (Chrysomelidae) and
the weevils (Curculionidae), are particularly problematic to agriculture, and are
responsible for a variety of insect damage to crops and plants. The infraorder
Cucujiformia includes the families Coccinellidae, Cucujidae, Lagridae, Meloidae,
Rhipiphoridae, and Tenebrionidae. Within this infraorder, members of the family
Chrysomelidae (which includes the genera Exema, Chrysomela, Oreina,
Chrysolina, Leptinotarsa, Gonioctena, Oulema, Monoxia, Ophraella, Cerotoma,
Diabrotica, and Lachnaia) are well-known for their potential to destroy
agricultural crops.

As the toxins of the present invention have been shown to be effective in
combatting a variety of members of the order Coleoptera, the inventors
contemplate that the insects of many Coleopteran genera may be controlled or eradicated
using the polypeptide compositions described herein. Likewise, the methods
described herein for generating modified polypeptides having enhanced insect
specificity may also be useful in extending the range of the insecticidal activity of the
modified polypeptides to other insect species within, and outside of, the Order
Coleoptera.

As such, the inventors contemplate that the crystal protein compositions
disclosed herein will find particular utility as insecticides for topical and/or
systemic application to field crops, including but not limited to rice, wheat, alfalfa,
corn (maize), soybeans, tobacco, potato, barley, canola (rapeseed), sugarbeet,
sugarcane, flax, rye, oats, cotton, sunflower; grasses, such as pasture and turf grasses;
fruits, citrus, nuts, trees, shrubs and vegetables; as well as ornamental plants, cacti,
succulents, and the like.

Disclosed and claimed is a composition comprising an insecticidally-
effective amount of a Cry3Bb* crystal protein composition. The composition
preferably comprises the amino acid sequence of SEQ ID NO:2, SEQ ID NO:4, SEQ
ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID
NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID
NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID
NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID
NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, or SEQ ID NO:108 or
biologically-functional equivalents thereof.

The insecticide composition may also comprise a Cry3Bb* crystal protein
that is encoded by a nucleic acid sequence having the sequence of SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ
ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ
ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ
ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ
ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ
ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ
ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, or SEQ
ID NO:108, or, alternatively, a nucleic acid sequence which hybridizes to the
nucleic acid sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,
SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17,
SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27,
SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37,
SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47,
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57,
SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67,
SEQ ID NO:69, SEQ ID NO:99, or SEQ ID NO:107 under conditions of moderate
stringency.

The insecticidal compositions may comprise one or more B. thuringiensis
cell types, or one or more cultures of such cells, or, alternatively, a mixture of one
or more B. thuringiensis cells which express one or more of the novel crystal
proteins of the invention in combination with another insecticidal composition. In
certain aspects it may be desirable to prepare compositions which contain a
plurality of crystal proteins, either native or modified, for treatment of one or more types
of susceptible insects. The B. thuringiensis cells of the invention can be treated
prior to formulation to prolong the insecticidal activity when the cells are applied
to the environment of the target insect(s). Such treatment can be by chemical or
physical means, or by a combination of chemical and/or physical means, so long as
the technique does not deleteriously affect the properties of the insecticide, nor
diminish the cellular capability in protecting the insecticide. Examples of chemical
reagents are halogenating agents, particularly halogens of atomic no. 17-80.
More particularly, iodine can be used under mild conditions and for sufficient time
to achieve the desired results. Other suitable techniques include treatment with
aldehydes, such as formaldehyde and glutaraldehyde; anti-infectives, such as
zephiran chloride; alcohols, such as isopropyl and ethanol; various histologic
fixatives, such as Bouin's fixative and Helly's fixative (see Humason, 1967); or a
combination of physical (heat) and chemical agents that prolong the activity of the
δ-endotoxin produced in the cell when the cell is applied to the environment of the
target pest(s). Examples of physical means are short wavelength radiation such as
gamma-radiation and X-radiation, freezing, UV irradiation, lyophilization, and the
like.

The inventors contemplate that any formulation methods known to those of
skill in the art may be employed using the proteins disclosed herein to prepare such
bioinsecticide compositions. It may be desirable to formulate whole cell
preparations, cell extracts, cell suspensions, cell homogenates, cell lysates, cell
supernatants, cell filtrates, or cell pellets of a cell culture (preferably a bacterial cell
culture such as a B. thuringiensis cell culture described in Table 3) that expresses one
or more cry3Bb* DNA segments to produce the encoded Cry3Bb* protein(s) or
peptide(s). The methods for preparing such formulations are known to those of
skill in the art, and may include, e.g., desiccation, lyophilization, homogenization,
extraction, filtration, centrifugation, sedimentation, or concentration of one or more
cultures of bacterial cells, such as B. thuringiensis cells described in Table 3, which
express the Cry3Bb* peptide(s) of interest.

In one preferred embodiment, the bioinsecticide composition comprises an
oil flowable suspension comprising lysed or unlysed bacterial cells, spores, or
crystals which contain one or more of the novel crystal proteins disclosed herein.
Preferably the cells are B. thuringiensis cells; however, any such bacterial host cell
expressing the novel nucleic acid segments disclosed herein and producing a crystal
protein is contemplated to be useful, such as Bacillus spp., including B.
megaterium, B. subtilis, and B. cereus; Escherichia spp., including E. coli; and/or
Pseudomonas spp., including P. cepacia, P. aeruginosa, and P. fluorescens.
Alternatively, the oil flowable suspension may consist of a combination of one or
more of the following compositions: lysed or unlysed bacterial cells, spores,
crystals, and/or purified crystal proteins.
In a second preferred embodiment, the bioinsecticide composition comprises
a water dispersible granule or powder. This granule or powder may comprise
lysed or unlysed bacterial cells, spores, or crystals which contain one or more
of the novel crystal proteins disclosed herein. Preferred sources for these
compositions include bacterial cells such as B. thuringiensis cells; however, bacteria of
the genera Bacillus, Escherichia, and Pseudomonas which have been transformed
with a DNA segment disclosed herein and expressing the crystal protein are also
contemplated to be useful. Alternatively, the granule or powder may consist of a
combination of one or more of the following compositions: lysed or unlysed
bacterial cells, spores, crystals, and/or purified crystal proteins.

In a third important embodiment, the bioinsecticide composition comprises
a wettable powder, spray, emulsion, colloid, aqueous or organic solution, dust,
pellet, or colloidal concentrate. Such a composition may contain either unlysed or
lysed bacterial cells, spores, crystals, or cell extracts as described above, which
contain one or more of the novel crystal proteins disclosed herein. Preferred
bacterial cells are B. thuringiensis cells; however, bacteria such as B. megaterium, B.
subtilis, B. cereus, E. coli, or Pseudomonas spp. cells transformed with a DNA
segment disclosed herein and expressing the crystal protein are also contemplated
to be useful. Such dry forms of the insecticidal compositions may be formulated to
dissolve immediately upon wetting, or alternatively, dissolve in a controlled-release,
sustained-release, or other time-dependent manner. Alternatively, such a
composition may consist of a combination of one or more of the following
compositions: lysed or unlysed bacterial cells, spores, crystals, and/or purified crystal
proteins.

In a fourth important embodiment, the bioinsecticide composition comprises
an aqueous solution or suspension or cell culture of lysed or unlysed bacterial
cells, spores, crystals, or a mixture of lysed or unlysed bacterial cells, spores,
and/or crystals, such as those described above which contain one or more of the
novel crystal proteins disclosed herein. Such aqueous solutions or suspensions
may be provided as a concentrated stock solution which is diluted prior to
application, or alternatively, as a diluted solution ready-to-apply.



WO 99/31 248 PCT/US98/26852 

For these methods involving application of bacterial cells, the cellular host
containing the crystal protein gene(s) may be grown in any convenient nutrient
medium, where the DNA construct provides a selective advantage, providing for a
selective medium so that substantially all or all of the cells retain the
B. thuringiensis gene. These cells may then be harvested in accordance with
conventional ways. Alternatively, the cells can be treated prior to harvesting.

When the insecticidal compositions comprise B. thuringiensis cells, spores,
and/or crystals containing the modified crystal protein(s) of interest, such
compositions may be formulated in a variety of ways. They may be employed as wettable
powders, granules or dusts, by mixing with various inert materials, such as
inorganic minerals (phyllosilicates, carbonates, sulfates, phosphates, and the like) or
botanical materials (powdered corncobs, rice hulls, walnut shells, and the like).
The formulations may include spreader-sticker adjuvants, stabilizing agents, other
pesticidal additives, or surfactants. Liquid formulations may be aqueous-based or
non-aqueous and employed as foams, suspensions, emulsifiable concentrates, or
the like. The ingredients may include rheological agents, surfactants, emulsifiers,
dispersants, or polymers.

Alternatively, the novel Cry3Bb-derived mutated crystal proteins may be
prepared by native or recombinant bacterial expression systems in vitro and
isolated for subsequent field application. Such protein may be either in crude cell
lysates, suspensions, colloids, etc., or alternatively may be purified, refined,
buffered, and/or further processed, before formulating in an active biocidal
formulation. Likewise, under certain circumstances, it may be desirable to isolate crystals
and/or spores from bacterial cultures expressing the crystal protein and apply
solutions, suspensions, or colloidal preparations of such crystals and/or spores as the
active bioinsecticidal composition.

Another important aspect of the invention is a method of controlling
coleopteran insects which are susceptible to the novel compositions disclosed herein.
Such a method generally comprises contacting the insect or insect population,
colony, etc., with an insecticidally-effective amount of a Cry3Bb* crystal protein
composition. The method may utilize Cry3Bb* crystal proteins such as those
disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID
NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID
NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID
NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID
NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID
NO:70, SEQ ID NO:100, or SEQ ID NO:108, or biologically functional equivalents
thereof.

Alternatively, the method may utilize one or more Cry3Bb* crystal proteins
which are encoded by the nucleic acid sequences of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ
ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ
ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ
ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ
ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ
ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or
SEQ ID NO:107, or by one or more nucleic acid sequences which hybridize to the
sequences of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID
NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID
NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID
NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID
NO:69, SEQ ID NO:99, SEQ ID NO:101, or SEQ ID NO:107, under conditions of
moderate, or higher, stringency. The methods for identifying sequences which
hybridize to those disclosed under conditions of moderate or higher stringency are
well-known to those of skill in the art, and are discussed herein.
Regardless of the method of application, the active component(s) are
applied at an insecticidally-effective amount, which will vary depending
on such factors as, for example, the specific coleopteran insects to be controlled,
the specific plant or crop to be treated, the environmental conditions, and the
method, rate, and quantity of application of the insecticidally-active composition.

The insecticide compositions described may be made by formulating either
the bacterial cell, crystal and/or spore suspension, or isolated protein component
with the desired agriculturally-acceptable carrier. The compositions may be
formulated prior to administration in an appropriate form, such as lyophilized,
freeze-dried, or desiccated, or in an aqueous carrier, medium or suitable diluent, such as
saline or other buffer. The formulated compositions may be in the form of a dust or
granular material, or a suspension in oil (vegetable or mineral), or water or
oil/water emulsions, or as a wettable powder, or in combination with any other
carrier material suitable for agricultural application. Suitable agricultural carriers
can be solid or liquid and are well known in the art. The term "agriculturally-acceptable
carrier" covers all adjuvants, e.g., inert components, dispersants, surfactants,
tackifiers, binders, etc. that are ordinarily used in insecticide formulation
technology; these are well known to those skilled in insecticide formulation. The
formulations may be mixed with one or more solid or liquid adjuvants and
prepared by various means, e.g., by homogeneously mixing, blending and/or grinding
the insecticidal composition with suitable adjuvants using conventional
formulation techniques.

The insecticidal compositions of this invention are applied to the
environment of the target coleopteran insect, typically onto the foliage of the plant or crop
to be protected, by conventional methods, preferably by spraying. The strength
and duration of insecticidal application will be set with regard to conditions
specific to the particular pest(s), crop(s) to be treated and particular environmental
conditions. The proportional ratio of active ingredient to carrier will naturally
depend on the chemical nature, solubility, and stability of the insecticidal
composition, as well as the particular formulation contemplated.
Other application techniques, e.g., dusting, sprinkling, soaking, soil
injection, soil tilling, seed coating, seedling coating, spraying, aerating, misting,
atomizing, and the like, are also feasible and may be required under certain circumstances,
such as for insects that cause root or stalk infestation, or for application to delicate
vegetation or ornamental plants. These application procedures are also well-known
to those of skill in the art.

The insecticidal composition of the invention may be employed in the
method of the invention singly or in combination with other compounds, including,
but not limited to, other pesticides. The method of the invention may also be used
in conjunction with other treatments such as surfactants, detergents, polymers or
time-release formulations. The insecticidal compositions of the present invention
may be formulated for either systemic or topical use.

The concentration of insecticidal composition which is used for
environmental, systemic, or foliar application will vary widely depending upon the nature
of the particular formulation, means of application, environmental conditions, and
degree of biocidal activity. Typically, the bioinsecticidal composition will be
present in the applied formulation at a concentration of at least about 1% by weight and
may be up to and including about 99% by weight. Dry formulations of the
compositions may be from about 1% to about 99% or more by weight of the
composition, while liquid formulations may generally comprise from about 1% to about
99% or more of the active ingredient by weight. Formulations which comprise
intact bacterial cells will generally contain from about 10^4 to about 10^12 cells/mg.

The insecticidal formulation may be administered to a particular plant or
target area in one or more applications as needed, with a typical field application
rate per hectare ranging on the order of from about 1 g to about 1 kg, 2 kg, 5 kg, or
more of active ingredient.
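
The arithmetic implied by the weight-percent and per-hectare ranges above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 250 g/ha target, and the 10% formulation are hypothetical values chosen for the example and are not taken from the disclosure.

```python
def formulation_mass_per_hectare(active_g_per_ha: float, active_fraction: float) -> float:
    """Return grams of formulated product that deliver the requested active ingredient.

    active_g_per_ha: desired active ingredient per hectare, in grams (hypothetical input).
    active_fraction: weight fraction of active ingredient in the formulation,
        restricted here to 0.01-0.99 to mirror the "about 1% to about 99%" range.
    """
    if active_g_per_ha <= 0:
        raise ValueError("active_g_per_ha must be positive")
    if not 0.01 <= active_fraction <= 0.99:
        raise ValueError("active_fraction must lie between 0.01 and 0.99")
    # mass of product = mass of active ingredient / weight fraction of active ingredient
    return active_g_per_ha / active_fraction


if __name__ == "__main__":
    # Hypothetical case: 250 g of active ingredient per hectare from a 10% w/w formulation.
    print(formulation_mass_per_hectare(250.0, 0.10))
```

A lower weight fraction simply scales the required product mass upward, which is why the text ties the application rate to the particular formulation contemplated.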

4.8 Nucleic Acid Segments as Hybridization Probes and Primers 

In addition to their use in directing the expression of crystal proteins or
peptides of the present invention, the nucleic acid sequences contemplated herein
also have a variety of other uses. For example, they also have utility as probes or
primers in nucleic acid hybridization embodiments. As such, it is contemplated
that nucleic acid segments that comprise a sequence region that consists of at least
a 14 nucleotide long contiguous sequence that has the same sequence as, or is
complementary to, a 14 nucleotide long contiguous DNA segment of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,
SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31,
SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51,
SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61,
SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99,
SEQ ID NO:101, or SEQ ID NO:107 will find particular utility. Longer contiguous
identical or complementary sequences, e.g., those of about 20, 30, 40, 50, 100, 200,
500, 1000, 2000, 5000, 10000 etc. (including all intermediate lengths and up to and
including full-length sequences) will also be of use in certain embodiments.
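
The 14-nucleotide criterion described above can be illustrated with a short Python sketch. It checks whether a candidate probe contains a contiguous stretch that is identical to a target sequence or to the target's reverse complement. The function names and the short test sequence are hypothetical placeholders and are not any of the disclosed SEQ ID sequences; only the 14-nucleotide default comes from the text.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_contiguous_stretch(probe: str, target: str, min_len: int = 14) -> bool:
    """True if the probe has a min_len window identical or complementary to the target."""
    probe = probe.upper()
    target = target.upper()
    if len(probe) < min_len:
        return False
    rc_target = reverse_complement(target)
    # Slide a window of min_len bases along the probe and look it up in both strands.
    for start in range(len(probe) - min_len + 1):
        window = probe[start:start + min_len]
        if window in target or window in rc_target:
            return True
    return False


if __name__ == "__main__":
    target = "ATGGCTAGCTAGGATCCGATCGATTACGGCTA"  # placeholder sequence, for illustration only
    print(shares_contiguous_stretch(target[5:25], target))                      # identical stretch
    print(shares_contiguous_stretch(reverse_complement(target[5:25]), target))  # complementary stretch
```

Raising `min_len` mirrors the longer contiguous stretches (20, 30, 40 nucleotides and so on) that the text also contemplates.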

The ability of such nucleic acid probes to specifically hybridize to crystal
protein-encoding sequences will enable them to be of use in detecting the presence
of complementary sequences in a given sample. However, other uses are
envisioned, including the use of the sequence information for the preparation of mutant
species primers, or primers for use in preparing other genetic constructions.

Nucleic acid molecules having sequence regions consisting of contiguous
nucleotide stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so,
identical or complementary to DNA sequences of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ
ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ
ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ
ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ
ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ
ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:99, SEQ ID NO:101, or
SEQ ID NO:107 are particularly contemplated as hybridization probes for use in,
e.g., Southern and Northern blotting. Smaller fragments will generally find use in
hybridization embodiments, wherein the length of the contiguous complementary
region may be varied, such as between about 10-14 and about 100 or 200
nucleotides, but larger contiguous complementary stretches may be used, according to the
length of the complementary sequences one wishes to detect.

The use of a hybridization probe of about 14 nucleotides in length allows
the formation of a duplex molecule that is both stable and selective. Molecules
having contiguous complementary sequences over stretches greater than 14 bases
in length are generally preferred, though, in order to increase stability and
selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid
molecules obtained. One will generally prefer to design nucleic acid molecules
having gene-complementary stretches of 15 to 20 contiguous nucleotides, or even
longer where desired.

Of course, fragments may also be obtained by other techniques such as,
e.g., by mechanical shearing or by restriction enzyme digestion. Small nucleic acid
segments or fragments may be readily prepared by, for example, directly
synthesizing the fragment by chemical means, as is commonly practiced using an automated
oligonucleotide synthesizer. Also, fragments may be obtained by application of
nucleic acid reproduction technology, such as the PCR™ technology of U.S.
Patents 4,683,195 and 4,683,202 (each incorporated herein by reference), by
introducing selected sequences into recombinant vectors for recombinant production, and
by other recombinant DNA techniques generally known to those of skill in the art
of molecular biology.

Accordingly, the nucleotide sequences of the invention may be used for
their ability to selectively form duplex molecules with complementary stretches of
DNA fragments. Depending on the application envisioned, one will desire to
employ varying conditions of hybridization to achieve varying degrees of selectivity
of probe towards target sequence. For applications requiring high selectivity, one
will typically desire to employ relatively stringent conditions to form the hybrids,
e.g., one will select relatively low salt and/or high temperature conditions, such as
provided by about 0.02 M to about 0.15 M NaCl at temperatures of about 50°C to
about 70°C. Such selective conditions tolerate little, if any, mismatch between the
probe and the template or target strand, and would be particularly suitable for
isolating crystal protein-encoding DNA segments. Detection of DNA segments via
hybridization is well-known to those of skill in the art, and the teachings of U.S.
Patents 4,965,188 and 5,176,995 (each incorporated herein by reference) are
exemplary of the methods of hybridization analyses. Teachings such as those found
in the texts of Maloy et al., 1994; Segal, 1976; Prokop, 1991; and Kuby, 1994, are
particularly relevant.

Of course, for some applications, for example, where one desires to prepare
mutants employing a mutant primer strand hybridized to an underlying template or
where one seeks to isolate crystal protein-encoding sequences from related species,
functional equivalents, or the like, less stringent hybridization conditions will
typically be needed in order to allow formation of the heteroduplex. In these
circumstances, one may desire to employ conditions such as about 0.15 M to about 0.9 M
salt, at temperatures ranging from about 20°C to about 55°C. Cross-hybridizing
species can thereby be readily identified as positively hybridizing signals with
respect to control hybridizations. In any case, it is generally appreciated that
conditions can be rendered more stringent by the addition of increasing amounts of
formamide, which serves to destabilize the hybrid duplex in the same manner as
increased temperature. Thus, hybridization conditions can be readily manipulated,
and thus will generally be a method of choice depending on the desired results.
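
The two condition ranges given above (about 0.02 M to 0.15 M NaCl at about 50°C to 70°C for high selectivity, and about 0.15 M to 0.9 M salt at about 20°C to 55°C for less stringent work) can be captured in a small lookup, sketched here in Python. The names `Regime`, `REGIMES`, and `matching_regimes` are hypothetical. The sketch treats the "about" limits as hard inclusive bounds and does not model formamide, so it is a reading aid rather than a protocol.

```python
from typing import List, NamedTuple, Tuple


class Regime(NamedTuple):
    name: str
    salt_m: Tuple[float, float]   # inclusive salt range, mol/L
    temp_c: Tuple[float, float]   # inclusive temperature range, degrees Celsius


# Ranges transcribed from the text; the labels are illustrative.
REGIMES = [
    Regime("high stringency", (0.02, 0.15), (50.0, 70.0)),
    Regime("lower stringency", (0.15, 0.9), (20.0, 55.0)),
]


def matching_regimes(salt_m: float, temp_c: float) -> List[str]:
    """Return the names of every regime whose salt and temperature ranges both apply."""
    return [
        r.name
        for r in REGIMES
        if r.salt_m[0] <= salt_m <= r.salt_m[1] and r.temp_c[0] <= temp_c <= r.temp_c[1]
    ]


if __name__ == "__main__":
    print(matching_regimes(0.05, 65.0))
    print(matching_regimes(0.5, 37.0))
```

Because the two ranges meet at 0.15 M and overlap between 50°C and 55°C, a condition on that boundary is reported under both labels, which reflects the continuum of stringency the text describes.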

In certain embodiments, it will be advantageous to employ nucleic acid
sequences of the present invention in combination with an appropriate means, such as
a label, for determining hybridization. A wide variety of appropriate indicator
means are known in the art, including fluorescent, radioactive, enzymatic or other
ligands, such as avidin/biotin, which are capable of giving a detectable signal. In
preferred embodiments, one will likely desire to employ a fluorescent label or an
enzyme tag, such as urease, alkaline phosphatase or peroxidase, instead of
radioactive or other environmentally undesirable reagents. In the case of enzyme tags,
colorimetric indicator substrates are known that can be employed to provide a means
visible to the human eye or spectrophotometrically, to identify specific
hybridization with complementary nucleic acid-containing samples.

In general, it is envisioned that the hybridization probes described herein
will be useful both as reagents in solution hybridization as well as in embodiments
employing a solid phase. In embodiments involving a solid phase, the test DNA
(or RNA) is adsorbed or otherwise affixed to a selected matrix or surface. This
fixed, single-stranded nucleic acid is then subjected to specific hybridization with
selected probes under desired conditions. The selected conditions will depend on
the particular circumstances based on the particular criteria required (depending,
for example, on the G+C content, type of target nucleic acid, source of nucleic
acid, size of hybridization probe, etc.). Following washing of the hybridized
surface so as to remove nonspecifically bound probe molecules, specific hybridization
is detected, or even quantitated, by means of the label.

4.9 Characteristics of Modified Cry3 δ-Endotoxins

The present invention provides novel polypeptides that define a whole or a
portion of a B. thuringiensis cry3Bb.60, cry3Bb.11221, cry3Bb.11222,
cry3Bb.11223, cry3Bb.11224, cry3Bb.11225, cry3Bb.11226, cry3Bb.11227,
cry3Bb.11228, cry3Bb.11229, cry3Bb.11230, cry3Bb.11231, cry3Bb.11232,
cry3Bb.11233, cry3Bb.11234, cry3Bb.11235, cry3Bb.11236, cry3Bb.11237,
cry3Bb.11238, cry3Bb.11239, cry3Bb.11241, cry3Bb.11242, cry3Bb.11032,
cry3Bb.11035, cry3Bb.11036, cry3Bb.11046, cry3Bb.11048, cry3Bb.11051,
cry3Bb.11057, cry3Bb.11058, cry3Bb.11081, cry3Bb.11082, cry3Bb.11083,
cry3Bb.11084, cry3Bb.11095 and cry3Bb.11098-encoded crystal protein.

4.10 Crystal Protein Nomenclature 

The inventors have arbitrarily assigned the designations Cry3Bb.60,
Cry3Bb.11221, Cry3Bb.11222, Cry3Bb.11223, Cry3Bb.11224, Cry3Bb.11225,
Cry3Bb.11226, Cry3Bb.11227, Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230,
Cry3Bb.11231, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11234, Cry3Bb.11235,
Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, Cry3Bb.11239, Cry3Bb.11241,
Cry3Bb.11242, Cry3Bb.11032, Cry3Bb.11035, Cry3Bb.11036, Cry3Bb.11046,
Cry3Bb.11048, Cry3Bb.11051, Cry3Bb.11057, Cry3Bb.11058, Cry3Bb.11081,
Cry3Bb.11082, Cry3Bb.11083, Cry3Bb.11084, Cry3Bb.11095 and Cry3Bb.11098
to the novel proteins of the invention.
Likewise, the arbitrary designations of cry3Bb.60, cry3Bb.11221,
cry3Bb.11222, cry3Bb.11223, cry3Bb.11224, cry3Bb.11225, cry3Bb.11226,
cry3Bb.11227, cry3Bb.11228, cry3Bb.11229, cry3Bb.11230, cry3Bb.11231,
cry3Bb.11232, cry3Bb.11233, cry3Bb.11234, cry3Bb.11235, cry3Bb.11236,
cry3Bb.11237, cry3Bb.11238, cry3Bb.11239, cry3Bb.11241, cry3Bb.11242,
cry3Bb.11032, cry3Bb.11035, cry3Bb.11036, cry3Bb.11046, cry3Bb.11048,
cry3Bb.11051, cry3Bb.11057, cry3Bb.11058, cry3Bb.11081, cry3Bb.11082,
cry3Bb.11083, cry3Bb.11084, cry3Bb.11095 and cry3Bb.11098 have been
assigned to the novel nucleic acid sequences which encode these polypeptides,
respectively. While formal assignment of gene and protein designations based on the
revised nomenclature of crystal protein endotoxins (Table 1) may be made by the
committee on the nomenclature of B. thuringiensis, any re-designations of the
compositions of the present invention are also contemplated to be fully within the
scope of the present disclosure.

4.11 Transformed Host Cells and Transgenic Plants

A bacterium, a yeast cell, or a plant cell or a plant transformed with an ex- 
pression vector of the present invention is also contemplated. A transgenic bacte- 
rium, yeast cell, plant cell or plant derived from such a transformed or transgenic 
cell is also one aspect of the invention. 

Such transformed host cells are often desirable for use in the production of
endotoxins and for expression of the various DNA gene constructs disclosed
herein. In some aspects of the invention, it is often desirable to modulate, regulate,
or otherwise control the expression of the gene segments disclosed herein. Such
methods are routine to those of skill in the molecular genetic arts. Typically, when
increased or over-expression of a particular gene is desired, various manipulations
may be employed for enhancing the expression of the messenger RNA, particularly
by using an active promoter, as well as by employing sequences which enhance
the stability of the messenger RNA in the particular transformed host cell.

Typically, the initiation and translational termination region will involve
stop codon(s), a terminator region, and optionally, a polyadenylation signal. In the
direction of transcription, namely in the 5' to 3' direction of the coding or sense
sequence, the construct will involve the transcriptional regulatory region, if any,
and the promoter, where the regulatory region may be either 5' or 3' of the
promoter, the ribosomal binding site, the initiation codon, the structural gene having
an open reading frame in phase with the initiation codon, the stop codon(s), the
polyadenylation signal sequence, if any, and the terminator region. This sequence
as a double strand may be used by itself for transformation of a microorganism
host, but will usually be included with a DNA sequence involving a marker, where
the second DNA sequence may be joined to the δ-endotoxin expression construct
during introduction of the DNA into the host.
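
The 5' to 3' ordering of construct elements described above can be written out as a short Python sketch. The function name, parameter names, and element labels are hypothetical and serve only to make the ordering explicit, including the optional elements and the two permitted positions of the regulatory region relative to the promoter.

```python
from typing import List


def construct_element_order(
    has_regulatory_region: bool = True,
    regulatory_region_upstream: bool = True,
    has_polyadenylation_signal: bool = True,
) -> List[str]:
    """List expression-construct elements in the 5' to 3' direction of the sense strand."""
    elements: List[str] = []
    # The regulatory region, if present, may sit either 5' or 3' of the promoter.
    if has_regulatory_region and regulatory_region_upstream:
        elements.append("transcriptional regulatory region")
    elements.append("promoter")
    if has_regulatory_region and not regulatory_region_upstream:
        elements.append("transcriptional regulatory region")
    elements += [
        "ribosomal binding site",
        "initiation codon",
        "structural gene (open reading frame in phase with initiation codon)",
        "stop codon(s)",
    ]
    # The polyadenylation signal is optional; the terminator region always comes last.
    if has_polyadenylation_signal:
        elements.append("polyadenylation signal sequence")
    elements.append("terminator region")
    return elements


if __name__ == "__main__":
    for position, element in enumerate(construct_element_order(), start=1):
        print(position, element)
```

A selectable marker, which the text says is usually supplied on a joined second DNA sequence, is deliberately left out of this ordering.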

By a marker is intended a structural gene which provides for selection of
those hosts which have been modified or transformed. The marker will normally
provide for selective advantage, for example, providing for biocide resistance, e.g.,
resistance to antibiotics or heavy metals; complementation, so as to provide
prototrophy to an auxotrophic host, or the like. Preferably, complementation is employed,
so that the modified host may not only be selected, but may also be competitive in
the field. One or more markers may be employed in the development of the
constructs, as well as for modifying the host. The organisms may be further modified
by providing for a competitive advantage against other wild-type microorganisms
in the field. For example, genes expressing metal chelating agents, e.g.,
siderophores, may be introduced into the host along with the structural gene expressing
the δ-endotoxin. In this manner, the enhanced expression of a siderophore may
provide for a competitive advantage for the δ-endotoxin-producing host, so that it
may effectively compete with the wild-type microorganisms and stably occupy a
niche in the environment.

Where no functional replication system is present, the construct will also
include a sequence of at least 50 basepairs (bp), preferably at least about 100 bp,
and usually not more than about 1000 bp of a sequence homologous with a
sequence in the host. In this way, the probability of legitimate recombination is
enhanced, so that the gene will be integrated into the host and stably maintained by
the host. Desirably, the δ-endotoxin gene will be in close proximity to the gene
providing for complementation as well as the gene providing for the competitive
advantage. Therefore, in the event that a δ-endotoxin gene is lost, the resulting
organism will be likely to also lose the complementing gene and/or the gene
providing for the competitive advantage, so that it will be unable to compete in the
environment with the gene retaining the intact construct.

The crystal protein-encoding gene can be introduced between the
transcriptional and translational initiation region and the transcriptional and translational
termination region, so as to be under the regulatory control of the initiation region.
This construct will be included in a plasmid, which will include at least one
replication system, but may include more than one, where one replication system is
employed for cloning during the development of the plasmid and the second
replication system is necessary for functioning in the ultimate host. In addition, one or
more markers may be present, which have been described previously. Where
integration is desired, the plasmid will desirably include a sequence homologous with
the host genome.

The transformants can be isolated in accordance with conventional ways,
usually employing a selection technique, which allows for selection of the desired
organism as against unmodified organisms or transferring organisms, when
present. The transformants then can be tested for pesticidal activity.

Suitable host cells, where the pesticide-containing cells will be treated to
prolong the activity of the δ-endotoxin in the cell when the treated cell is
applied to the environment of target pest(s), may include either prokaryotes or
eukaryotes, normally being limited to those cells which do not produce substances
toxic to higher organisms, such as mammals. However, organisms which produce
substances toxic to higher organisms could be used, where the δ-endotoxin is
unstable or the level of application sufficiently low as to avoid any possibility of
toxicity to a mammalian host. As hosts, of particular interest will be the prokaryotes
and the lower eukaryotes, such as fungi. Illustrative prokaryotes, both
Gram-negative and -positive, include Enterobacteriaceae, such as Escherichia, Erwinia,
Shigella, Salmonella, and Proteus; Bacillaceae; Rhizobiaceae, such as Rhizobium;
Spirillaceae, such as Photobacterium, Zymomonas, Serratia, Aeromonas, Vibrio,
Desulfovibrio, Spirillum; Lactobacillaceae; phylloplane organisms such as
members of the Pseudomonadaceae (including Pseudomonas spp. and Acetobacter
spp.); Azotobacteraceae and Nitrobacteraceae; Flavobacterium spp.; members of
the Bacillaceae such as Lactobacillus spp., Bifidobacterium, and Bacillus spp., and
the like. Particularly preferred host cells include Pseudomonas aeruginosa,
Pseudomonas fluorescens, Bacillus thuringiensis, Escherichia coli, Bacillus subtilis,
and the like.

Among eukaryotes are fungi, such as Phycomycetes and Ascomycetes,
which includes yeast, such as Schizosaccharomyces; and Basidiomycetes,
Rhodotorula, Aureobasidium, Sporobolomyces, Saccharomyces spp., and
Sporobolomyces spp.

Characteristics of particular interest in selecting a host cell for purposes of
production include ease of introducing the δ-endotoxin gene into the host,
availability of expression systems, efficiency of expression, stability of the pesticide in
the host, and the presence of auxiliary genetic capabilities. Characteristics of
interest for use as a pesticide microcapsule include protective qualities for the pesticide,
such as thick cell walls, pigmentation, and intracellular packaging or formation of
inclusion bodies; leaf affinity; lack of mammalian toxicity; attractiveness to pests
for ingestion; ease of killing and fixing without damage to the δ-endotoxin; and the
like. Other considerations include ease of formulation and handling, economics,
storage stability, and the like.

The cell will usually be intact and be substantially in the proliferative form
when treated, rather than in a spore form, although in some instances spores may
be employed. Treatment of the recombinant microbial cell can be done as
disclosed infra. The treated cells generally will have enhanced structural
stability which will enhance resistance to environmental conditions.

Genes or other nucleic acid segments, as disclosed herein, can be inserted
into host cells using a variety of techniques which are well known in the art.
For example, a large number of cloning vectors comprising a replication system
in E. coli and a marker that permits selection of the transformed cells are
available for preparation for the insertion of foreign genes into higher
organisms, including plants. The vectors comprise, for example, pBR322, pUC
series, M13mp series, pACYC184, etc. Accordingly, the sequence coding for the
δ-endotoxin can be inserted into the vector at a suitable restriction site. The
resulting plasmid is used for transformation into E. coli. The E. coli cells
are cultivated in a suitable nutrient medium, then harvested and lysed. The
plasmid is recovered. Sequence analysis, restriction analysis, electrophoresis,
and other biochemical-molecular biological methods are generally carried out as
methods of analysis. After each manipulation, the DNA sequence used can be
cleaved and joined to the next DNA sequence. Each plasmid sequence can be cloned
in the same or other plasmids. Depending on the method of inserting desired
genes into the plant, other DNA sequences may be necessary.

Methods for DNA transformation of plant cells include Agrobacterium-mediated
plant transformation, protoplast transformation, gene transfer into pollen,
injection into reproductive organs, injection into immature embryos and particle
bombardment. Each of these methods has distinct advantages and disadvantages.
Thus, one particular method of introducing genes into a particular plant strain
may not necessarily be the most effective for another plant strain, but it is
well known which methods are useful for a particular plant strain.

Suitable methods are believed to include virtually any method by which DNA can
be introduced into a cell, such as by Agrobacterium infection, direct delivery
of DNA such as, for example, by PEG-mediated transformation of protoplasts
(Omirulleh et al., 1993), by desiccation/inhibition-mediated DNA uptake, by
electroporation, by agitation with silicon carbide fibers, by acceleration of
DNA-coated particles, etc. In certain embodiments, acceleration methods are
preferred and include, for example, microprojectile bombardment and the like.

Technology for introduction of DNA into cells is well-known to those of skill
in the art. Four general methods for delivering a gene into cells have been
described: (1) chemical methods (Graham and van der Eb, 1973; Zatloukal et al.,
1992); (2) physical methods such as microinjection (Capecchi, 1980),
electroporation (Wong and Neumann, 1982; Fromm et al., 1985) and the gene gun
(Johnston and Tang, 1994; Fynan et al., 1993); (3) viral vectors (Clapp, 1993;
Lu et al., 1993; Eglitis and Anderson, 1988; Eglitis et al., 1988); and (4)
receptor-mediated mechanisms (Curiel et al., 1991; 1992; Wagner et al., 1992).

A large number of techniques are available for inserting DNA into a plant host
cell. Those techniques include transformation with T-DNA using Agrobacterium
tumefaciens or Agrobacterium rhizogenes as transformation agent, fusion,
injection, or electroporation as well as other possible methods. If agrobacteria
are used for the transformation, the DNA to be inserted has to be cloned into
special plasmids, namely either into an intermediate vector or into a binary
vector. The intermediate vectors can be integrated into the Ti or Ri plasmid by
homologous recombination owing to sequences that are homologous to sequences in
the T-DNA. The Ti or Ri plasmid also comprises the vir region necessary for the
transfer of the T-DNA.

Intermediate vectors cannot replicate themselves in agrobacteria. The
intermediate vector can be transferred into Agrobacterium tumefaciens by means
of a helper plasmid (conjugation). Binary vectors can replicate themselves both
in E. coli and in agrobacteria. They comprise a selection marker gene and a
linker or polylinker which are framed by the right and left T-DNA border
regions. They can be transformed directly into agrobacteria (Holsters et al.,
1978). The agrobacterium used as host cell is to comprise a plasmid carrying a
vir region. The vir region is necessary for the transfer of the T-DNA into the
plant cell. Additional T-DNA may be contained. The bacterium so transformed is
used for the transformation of plant cells. Plant explants can advantageously be
cultivated with Agrobacterium tumefaciens or Agrobacterium rhizogenes for the
transfer of the DNA into the plant cell. Whole plants can then be regenerated
from the infected plant material (for example, pieces of leaf, segments of
stalk, roots, but also protoplasts or suspension-cultivated cells) in a suitable
medium, which may contain antibiotics or biocides for selection. The plants so
obtained can then be tested for the presence of the inserted DNA. No special
demands are made of the plasmids in the case of injection and electroporation.
It is possible to use ordinary plasmids, such as, for example, pUC derivatives.
If, for example, the Ti or Ri plasmid is used for the transformation of the
plant cell, then at least the right border, but often the right and the left
border of the Ti or Ri plasmid T-DNA, has to be joined as the flanking region of
the genes to be inserted. The use of T-DNA for the transformation of plant cells
has been intensively researched and sufficiently described in Eur. Pat. Appl.
No. EP 120 516; Hoekema (1985); An et al. (1985); Herrera-Estrella et al.
(1983); Bevan et al. (1983); and Klee et al. (1985).

A particularly useful Ti plasmid cassette vector for transformation of
dicotyledonous plants consists of the enhanced CaMV35S promoter (EN35S) and the
3' end including polyadenylation signals from a soybean gene encoding the
α'-subunit of β-conglycinin. Between these two elements is a multilinker
containing multiple restriction sites for the insertion of genes of interest.

The vector preferably contains a segment of pBR322 which provides an origin of
replication in E. coli and a region for homologous recombination with the
disarmed T-DNA in Agrobacterium strain ACO; the oriV region from the broad host
range plasmid RK1; the streptomycin/spectinomycin resistance gene from Tn7; and
a chimeric NPTII gene, containing the CaMV35S promoter and the nopaline synthase
(NOS) 3' end, which provides kanamycin resistance in transformed plant cells.

Optionally, the enhanced CaMV35S promoter may be replaced with the 1.5 kb
mannopine synthase (MAS) promoter (Velten et al., 1984). After incorporation of
a DNA construct into the vector, it is introduced into A. tumefaciens strain ACO
which contains a disarmed Ti plasmid. Cointegrate Ti plasmid vectors are
selected and subsequently may be used to transform a dicotyledonous plant.

A. tumefaciens ACO is a disarmed strain similar to pTiB6SE described by Fraley
et al. (1985). For construction of ACO the starting Agrobacterium strain was the
strain A208 which contains a nopaline-type Ti plasmid. The Ti plasmid was
disarmed in a manner similar to that described by Fraley et al. (1985) so that
essentially all of the native T-DNA was removed except for the left border and a
few hundred base pairs of T-DNA inside the left border. The remainder of the
T-DNA extending to a point just beyond the right border was replaced with a
novel piece of DNA including (from left to right) a segment of pBR322, the oriV
region from plasmid RK2, and the kanamycin resistance gene from Tn601. The
pBR322 and oriV segments are similar to the corresponding segments of the vector
described above and provide a region of homology for cointegrate formation.

Once the inserted DNA has been integrated in the genome, it is relatively
stable there and, as a rule, does not come out again. It normally contains a
selection marker that confers on the transformed plant cells resistance to a
biocide or an antibiotic, such as kanamycin, G 418, bleomycin, hygromycin, or
chloramphenicol, inter alia. The individually employed marker should accordingly
permit the selection of transformed cells rather than cells that do not contain
the inserted DNA.


4.11.1 Electroporation 

The application of brief, high-voltage electric pulses to a variety of animal
and plant cells leads to the formation of nanometer-sized pores in the plasma
membrane. DNA is taken directly into the cell cytoplasm either through these
pores or as a consequence of the redistribution of membrane components that
accompanies closure of the pores. Electroporation can be extremely efficient and
can be used both for transient expression of cloned genes and for establishment
of cell lines that carry integrated copies of the gene of interest.
Electroporation, in contrast to calcium phosphate-mediated transfection and
protoplast fusion, frequently gives rise to cell lines that carry one, or at
most a few, integrated copies of the foreign DNA.

The introduction of DNA by means of electroporation is well-known to those of
skill in the art. In this method, certain cell wall-degrading enzymes, such as
pectin-degrading enzymes, are employed to render the target recipient cells more
susceptible to transformation by electroporation than untreated cells.
Alternatively, recipient cells are made more susceptible to transformation by
mechanical wounding. To effect transformation by electroporation one may employ
either friable tissues such as a suspension culture of cells, or embryogenic
callus, or alternatively, one may transform immature embryos or other organized
tissues directly. One would partially degrade the cell walls of the chosen cells
by exposing them to pectin-degrading enzymes (pectolyases) or mechanically
wounding in a controlled manner. Such cells would then be receptive to DNA
transfer by electroporation, which may be carried out at this stage, and
transformed cells then identified by a suitable selection or screening protocol
dependent on the nature of the newly incorporated DNA.

4.11.2 Microprojectile Bombardment

A further advantageous method for delivering transforming DNA segments 
to plant cells is microprojectile bombardment. In this method, particles may be 
coated with nucleic acids and delivered into cells by a propelling force. Exemplary 
particles include those comprised of tungsten, gold, platinum, and the like. 

An advantage of microprojectile bombardment, in addition to it being an
effective means of reproducibly stably transforming monocots, is that neither
the isolation of protoplasts (Christou et al., 1988) nor the susceptibility to
Agrobacterium infection is required. An illustrative embodiment of a method for
delivering DNA into maize cells by acceleration is a Biolistics Particle
Delivery System, which can be used to propel particles coated with DNA or cells
through a screen, such as a stainless steel or Nytex screen, onto a filter
surface covered with corn cells cultured in suspension. The screen disperses the
particles so that they are not delivered to the recipient cells in large
aggregates. It is believed that a screen intervening between the projectile
apparatus and the cells to be bombarded reduces the size of projectile
aggregates and may contribute to a higher frequency of transformation by
reducing damage inflicted on the recipient cells by projectiles that are too
large.

For the bombardment, cells in suspension are preferably concentrated on filters
or solid culture medium. Alternatively, immature embryos or other target cells
may be arranged on solid culture medium. The cells to be bombarded are
positioned at an appropriate distance below the macroprojectile stopping plate.
If desired, one or more screens are also positioned between the acceleration
device and the cells to be bombarded. Through the use of techniques set forth
herein one may obtain up to 1000 or more foci of cells transiently expressing a
marker gene. The number of cells in a focus which express the exogenous gene
product 48 hours post-bombardment often ranges from 1 to 10 and averages 1 to 3.

In bombardment transformation, one may optimize the prebombardment culturing
conditions and the bombardment parameters to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment are
important in this technology. Physical factors are those that involve
manipulating the DNA/microprojectile precipitate or those that affect the flight
and velocity of either the macro- or microprojectiles. Biological factors
include all steps involved in manipulation of cells before and immediately after
bombardment, the osmotic adjustment of target cells to help alleviate the trauma
associated with bombardment, and also the nature of the transforming DNA, such
as linearized DNA or intact supercoiled plasmids. It is believed that
pre-bombardment manipulations are especially important for successful
transformation of immature embryos.

Accordingly, it is contemplated that one may wish to adjust various of the
bombardment parameters in small scale studies to fully optimize the conditions.
One may particularly wish to adjust physical parameters such as gap distance,
flight distance, tissue distance, and helium pressure. One may also minimize the
trauma reduction factors (TRFs) by modifying conditions which influence the
physiological state of the recipient cells and which may therefore influence
transformation and integration efficiencies. For example, the osmotic state,
tissue hydration and the subculture stage or cell cycle of the recipient cells
may be adjusted for optimum transformation. The execution of other routine
adjustments will be known to those of skill in the art in light of the present
disclosure.

4.11.3 Agrobacterium-Mediated Transfer

Agrobacterium-mediated transfer is a widely applicable system for introducing
genes into plant cells because the DNA can be introduced into whole plant
tissues, thereby bypassing the need for regeneration of an intact plant from a
protoplast. The use of Agrobacterium-mediated plant integrating vectors to
introduce DNA into plant cells is well known in the art. See, for example, the
methods described (Fraley et al., 1985; Rogers et al., 1987). Further, the
integration of the Ti-DNA is a relatively precise process resulting in few
rearrangements. The region of DNA to be transferred is defined by the border
sequences, and intervening DNA is usually inserted into the plant genome as
described (Spielmann et al., 1986; Jorgensen et al., 1987).

Modern Agrobacterium transformation vectors are capable of replication in
E. coli as well as Agrobacterium, allowing for convenient manipulations as
described (Klee et al., 1985). Moreover, recent technological advances in
vectors for Agrobacterium-mediated gene transfer have improved the arrangement
of genes and restriction sites in the vectors to facilitate construction of
vectors capable of expressing various polypeptide coding genes. The vectors
described (Rogers et al., 1987) have convenient multi-linker regions flanked by
a promoter and a polyadenylation site for direct expression of inserted
polypeptide coding genes and are suitable for present purposes. In addition,
Agrobacterium containing both armed and disarmed Ti genes can be used for the
transformations. In those plant strains where Agrobacterium-mediated
transformation is efficient, it is the method of choice because of the facile
and defined nature of the gene transfer.

Agrobacterium-mediated transformation of leaf disks and other tissues such as
cotyledons and hypocotyls appears to be limited to plants that Agrobacterium
naturally infects. Agrobacterium-mediated transformation is most efficient in
dicotyledonous plants. Few monocots appear to be natural hosts for
Agrobacterium, although transgenic plants have been produced in asparagus using
Agrobacterium vectors as described (Bytebier et al., 1987). Therefore,
commercially important cereal grains such as rice, corn, and wheat must usually
be transformed using alternative methods. However, as mentioned above, the
transformation of asparagus using Agrobacterium can also be achieved (see, for
example, Bytebier et al., 1987).

A transgenic plant formed using Agrobacterium transformation methods typically
contains a single gene on one chromosome. Such transgenic plants can be referred
to as being heterozygous for the added gene. However, inasmuch as use of the
word "heterozygous" usually implies the presence of a complementary gene at the
same locus of the second chromosome of a pair of chromosomes, and there is no
such gene in a plant containing one added gene as here, it is believed that a
more accurate name for such a plant is an independent segregant, because the
added, exogenous gene segregates independently during mitosis and meiosis.

More preferred is a transgenic plant that is homozygous for the added
structural gene; i.e., a transgenic plant that contains two added genes, one
gene at the same locus on each chromosome of a chromosome pair. A homozygous
transgenic plant can be obtained by sexually mating (selfing) an independent
segregant transgenic plant that contains a single added gene, germinating some
of the seed produced and analyzing the resulting plants produced for enhanced
carboxylase activity relative to a control (native, non-transgenic) or an
independent segregant transgenic plant.
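The expected outcome of such a selfing can be sketched with a short
enumeration. The following Python example is illustrative only and is not part
of the disclosed methods. It assumes a single insertion locus, simple Mendelian
segregation, and equal viability of all gametes; the allele labels "T" (added
gene present) and "-" (added gene absent) and the function name are
hypothetical names chosen for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfing_ratios(parent=("T", "-")):
    """Return the expected fraction of selfed progeny carrying 0, 1, or 2
    copies of the added gene, given the two alleles of the parent.

    Assumption (not from the source): one locus, Mendelian segregation.
    """
    counts = Counter()
    # Each progeny receives one allele via the egg and one via the pollen.
    for egg, pollen in product(parent, repeat=2):
        copies = (egg == "T") + (pollen == "T")
        counts[copies] += 1
    total = sum(counts.values())
    return {copies: Fraction(n, total) for copies, n in counts.items()}


if __name__ == "__main__":
    for copies, fraction in sorted(selfing_ratios().items(), reverse=True):
        print(f"{copies} copies of the added gene: {fraction}")
```

Under these assumptions about one quarter of the selfed progeny would carry the
added gene at the same locus on both chromosomes of the pair, which is why
analysis of the germinated seed is needed to pick out the homozygous plants.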

It is to be understood that two different transgenic plants can also be mated
to produce offspring that contain two independently segregating added, exogenous
genes. Selfing of appropriate progeny can produce plants that are homozygous for
both added, exogenous genes that encode a polypeptide of interest. Back-crossing
to a parental plant and out-crossing with a non-transgenic plant are also
contemplated.
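As a rough illustration of how often such doubly homozygous plants would be
expected, the following Python sketch enumerates the progeny of a selfed plant
that carries one copy of each of two added genes. It is a hedged example, not a
description of any procedure in this disclosure; it assumes two unlinked loci,
Mendelian segregation, and equal gamete viability, and the labels "A", "B", and
"-" are hypothetical.

```python
from fractions import Fraction
from itertools import product


def fraction_double_homozygous():
    """Expected fraction of selfed progeny homozygous for both added genes.

    Assumption (not from the source): the two added genes sit at unlinked
    loci and the selfed parent carries exactly one copy of each.
    """
    # A gamete carries either the added gene or nothing at each locus.
    gametes = list(product(("A", "-"), ("B", "-")))
    progeny = list(product(gametes, repeat=2))  # (egg, pollen) pairs
    # Homozygous for both genes means both gametes carried both genes.
    hits = sum(
        1 for egg, pollen in progeny
        if egg == ("A", "B") and pollen == ("A", "B")
    )
    return Fraction(hits, len(progeny))


if __name__ == "__main__":
    print(fraction_double_homozygous())
```

Under the stated assumptions only a small minority of the selfed progeny is
homozygous for both genes, so correspondingly more progeny would have to be
screened than in the single-gene case.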

Transformation of plant protoplasts can be achieved using methods based on
calcium phosphate precipitation, polyethylene glycol treatment, electroporation,
and combinations of these treatments (see, e.g., Potrykus et al., 1985; Lorz et
al., 1985; Fromm et al., 1985; Uchimiya et al., 1986; Callis et al., 1987;
Marcotte et al., 1988).

Application of these systems to different plant strains depends upon the
ability to regenerate that particular plant strain from protoplasts.
Illustrative methods for the regeneration of cereals from protoplasts are
described (Fujimura et al., 1985; Toriyama et al., 1986; Yamada et al., 1986;
Abdullah et al., 1986).

To transform plant strains that cannot be successfully regenerated from
protoplasts, other ways to introduce DNA into intact cells or tissues can be
utilized. For example, regeneration of cereals from immature embryos or explants
can be effected as described (Vasil, 1988). In addition, "particle gun" or
high-velocity microprojectile technology can be utilized (Vasil, 1992).

Using that latter technology, DNA is carried through the cell wall and into
the cytoplasm on the surface of small metal particles as described (Klein et
al., 1987; Klein et al., 1988; McCabe et al., 1988). The metal particles
penetrate through several layers of cells and thus allow the transformation of
cells within tissue explants.



4.11.4 Gene Expression in Plants

Although great progress has been made in recent years with respect to
preparation of transgenic plants which express bacterial proteins such as
B. thuringiensis crystal proteins, the results of expressing native bacterial
genes in plants are often disappointing. Unlike in microbial genetics, little
was known by early plant geneticists about the factors which affected
heterologous expression of foreign genes in plants. In recent years, however,
several potential factors have been implicated as responsible in varying degrees
for the level of protein expression from a particular coding sequence. For
example, scientists now know that maintaining a significant level of a
particular mRNA in the cell is indeed a critical factor. Unfortunately, the
causes for low steady state levels of mRNA encoding foreign proteins are many.
First, full length RNA synthesis may not occur at a high frequency. This could,
for example, be caused by premature termination of the RNA during transcription
or by unexpected mRNA processing during transcription. Second, full length RNA
may be produced in the plant cell, but then processed (splicing, polyA addition)
in the nucleus in a fashion that creates a nonfunctional mRNA. If the RNA is not
properly synthesized, terminated and polyadenylated, it cannot move to the
cytoplasm for translation. Similarly, in the cytoplasm, if mRNAs have reduced
half-lives (which are determined by their primary or secondary sequence)
insufficient protein product will be produced. In addition, there is an effect,
whose magnitude is uncertain, of translational efficiency on mRNA half-life. In
addition, every RNA molecule folds into a particular structure, or perhaps
family of structures, which is determined by its sequence. The particular
structure of any RNA might lead to greater or lesser stability in the cytoplasm.
Structure per se is probably also a determinant of mRNA processing in the
nucleus. Unfortunately, it is impossible to predict, and nearly impossible to
determine, the structure of any RNA (except for tRNA) in vitro or in vivo.
However, it is likely that dramatically changing the sequence of an RNA will
have a large effect on its folded structure. It is likely that structure per se
or particular structural features also have a role in determining RNA stability.

To overcome these limitations in foreign gene expression, researchers have
identified particular sequences and signals in RNAs that have the potential for
having a specific effect on RNA stability. In certain embodiments of the
invention, therefore, there is a desire to optimize expression of the disclosed
nucleic acid segments in planta. One particular method of doing so is by
alteration of the bacterial gene to remove sequences or motifs which decrease
expression in a transformed plant cell. The process of engineering a coding
sequence for optimal expression in planta is often referred to as "plantizing" a
DNA sequence.
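One way to picture the removal of a sequence motif without changing the encoded
protein is a greedy substitution of synonymous codons. The Python sketch below
is offered only as an illustration under stated assumptions and is not the
procedure used to design any gene of the invention: it assumes the standard
genetic code, an in-frame coding sequence whose length is a multiple of three,
and a single target motif (ATTTA is used as the default purely as an example).
All function names are hypothetical, and no account is taken of codon usage,
RNA structure, or any other consideration discussed in this section.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def count_motif(seq, motif):
    """Count occurrences of motif in seq, including overlapping ones."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def remove_motif(seq, motif="ATTTA"):
    """Greedily swap synonymous codons to reduce occurrences of motif.

    Illustrative heuristic only: it may leave occurrences that cannot be
    removed by a single synonymous change.
    """
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for idx in range(len(codons)):
        remaining = count_motif("".join(codons), motif)
        if remaining == 0:
            break
        codon = codons[idx]
        for alt, aa in CODON_TABLE.items():
            if alt == codon or aa != CODON_TABLE[codon]:
                continue  # only consider different codons for the same residue
            trial = codons[:idx] + [alt] + codons[idx + 1:]
            if count_motif("".join(trial), motif) < remaining:
                codons = trial
                break
    return "".join(codons)


if __name__ == "__main__":
    original = "GATTTATCC"  # encodes Asp-Leu-Ser and contains ATTTA
    modified = remove_motif(original)
    print(original, "->", modified, translate(modified))
```

In this toy input the motif is removed by a single third-position change while
the translated peptide stays the same; a real design would weigh many more
factors than this heuristic does.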

Particularly problematic sequences are those which are A+T rich.
Unfortunately, since B. thuringiensis has an A+T rich genome, native crystal
protein gene sequences must often be modified for optimal expression in a plant.
The sequence motif ATTTA (or AUUUA as it appears in RNA) has been implicated as
a destabilizing sequence in mammalian cell mRNA (Shaw and Kamen, 1986). Many
short lived mRNAs have A+T rich 3' untranslated regions, and these regions often
have the ATTTA sequence, sometimes present in multiple copies or as multimers
(e.g., ATTTATTTA...). Shaw and Kamen showed that the transfer of the 3' end of
an unstable mRNA to a stable RNA (globin or VA1) decreased the stable RNA's
half-life dramatically. They further showed that a pentamer of ATTTA had a
profound destabilizing effect on a stable message, and that this signal could
exert its effect whether it was located at the 3' end or within the coding
sequence. However, the number of ATTTA sequences and/or the sequence context in
which they occur also appear to be important in determining whether they
function as destabilizing sequences. Shaw and Kamen showed that a trimer of
ATTTA had much less effect than a pentamer on mRNA stability and a dimer or a
monomer had no effect on stability (Shaw and Kamen, 1987). Note that multimers
of ATTTA such as a pentamer automatically create an A+T rich region. This was
shown to be a cytoplasmic effect, not nuclear. In other unstable mRNAs, the
ATTTA sequence may be present in only a single copy, but it is often contained
in an A+T rich region. From the animal cell data collected to date, it appears
that ATTTA at least in some contexts is important in stability, but it is not
yet possible to predict which occurrences of ATTTA are destabilizing elements or
whether any of these effects are likely to be seen in plants.
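Locating ATTTA occurrences and overlapping multimers of the kind written above
as ATTTATTTA... is a simple pattern search. The following Python sketch is a
minimal illustration, not a predictive tool: as the text notes, the presence of
the motif does not establish that it is destabilizing. The function names are
hypothetical, and the code assumes a plain DNA or RNA string (U is treated as
T).

```python
import re


def find_attta(seq):
    """Return 0-based start positions of every ATTTA, overlaps included."""
    seq = seq.upper().replace("U", "T")
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(r"(?=ATTTA)", seq)]


def longest_multimer(seq):
    """Return the largest n such that n overlapping ATTTA repeats
    (A followed by n copies of TTTA) occur contiguously in seq."""
    seq = seq.upper().replace("U", "T")
    runs = re.findall(r"A(?:TTTA)+", seq)
    # A run of length L contains (L - 1) // 4 copies of the TTTA unit.
    return max(((len(run) - 1) // 4 for run in runs), default=0)


if __name__ == "__main__":
    example = "GGATTTATTTACC"  # contains the dimer ATTTATTTA
    print(find_attta(example), longest_multimer(example))
```

A search of this kind only flags candidate sites; whether a given occurrence
matters depends on copy number and sequence context as described above.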

Some studies on mRNA degradation in animal cells also indicate that RNA
degradation may begin in some cases with nucleolytic attack in A+T rich regions.
It is not clear if these cleavages occur at ATTTA sequences. There are also
examples of mRNAs that have differential stability depending on the cell type in
which they are expressed or on the stage within the cell cycle at which they are
expressed. For example, histone mRNAs are stable during DNA synthesis but
unstable if DNA synthesis is disrupted. The 3' end of some histone mRNAs seems
to be responsible for this effect (Pandey and Marzluff, 1987). It does not
appear to be mediated by ATTTA, nor is it clear what controls the differential
stability of this mRNA. Another example is the differential stability of IgG
mRNA in B lymphocytes during B cell maturation (Genovese and Milcarek, 1988). A
final example is the instability of a mutant β-thalassemic globin mRNA. In bone
marrow cells, where this gene is normally expressed, the mutant mRNA is
unstable, while the wild-type mRNA is stable. When the mutant gene is expressed
in HeLa or L cells in vitro, the mutant mRNA shows no instability (Lim et al.,
1988). These examples all provide evidence that mRNA stability can be mediated
by cell type or cell cycle specific factors. Furthermore this type of
instability is not yet associated with specific sequences. Given these
uncertainties, it is not possible to predict which RNAs are likely to be
unstable in a given cell. In addition, even the ATTTA motif may act
differentially depending on the nature of the cell in which the RNA is present.
Shaw and Kamen (1987) have reported that activation of protein kinase C can
block degradation mediated by ATTTA.

The addition of a polyadenylate string to the 3' end is common to most
eukaryotic mRNAs, both plant and animal. The currently accepted view of polyA
addition is that the nascent transcript extends beyond the mature 3' terminus.
Contained within this transcript are signals for polyadenylation and proper 3'
end formation. This processing at the 3' end involves cleavage of the mRNA and
addition of polyA to the mature 3' end. By searching for consensus sequences
near the polyA tract in both plant and animal mRNAs, it has been possible to
identify consensus sequences that apparently are involved in polyA addition and
3' end cleavage. The same consensus sequences seem to be important to both of
these processes. These signals are typically a variation on the sequence
AATAAA. In animal cells, some variants of this sequence that are functional have
been identified; in plant cells there seems to be an extended range of
functional sequences (Wickens and Stephenson, 1984; Dean et al., 1986). Because
all of these consensus sequences are variations on AATAAA, they all are A+T rich
sequences. This sequence is typically found 15 to 20 bp before the polyA tract
in a mature mRNA. Studies in animal cells indicate that this sequence is
involved in both polyA addition and 3' maturation. Site directed mutations in
this sequence can disrupt these functions (Conway and Wickens, 1988; Wickens et
al., 1987). However, it has also been observed that sequences up to 50 to 100 bp
3' to the putative polyA signal are also required; i.e., a gene that has a
normal AATAAA but has been replaced or disrupted downstream does not get
properly polyadenylated (Gil and Proudfoot, 1984; Sadofsky and Alwine, 1984;
McDevitt et al., 1984). That is, the polyA signal itself is not sufficient for
complete and proper processing. It is not yet known what specific downstream
sequences are required in addition to the polyA signal, or if there is a
specific sequence that has this function. Therefore, sequence analysis can only
identify potential polyA signals.
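A sequence scan of the kind referred to here can be sketched in a few lines of
Python. The example below is illustrative only. Its hexamer list contains
AATAAA, named in the text above, together with the variant hexamers that are
legible in Table 5 of this document; the list is therefore an assumption of the
example and is not a complete or authoritative set. The function name is
hypothetical. Consistent with the text, a hit is only a potential signal and
says nothing about whether the site functions.

```python
# Hexamers taken from the text (AATAAA) and from the legible rows of Table 5.
# Assumption of this sketch: the list is incomplete and purely illustrative.
PUTATIVE_SIGNALS = (
    "AATAAA",
    "ATTAAT",
    "ATACAT",
    "AAAATA",
    "ATTAAA",
    "AATTAA",
    "AATACA",
    "CATAAA",
)


def find_putative_polya_signals(seq, signals=PUTATIVE_SIGNALS):
    """Return (position, hexamer) pairs for every listed hexamer in seq.

    Positions are 0-based; overlapping hits are all reported.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in signals:
            hits.append((i, hexamer))
    return hits


if __name__ == "__main__":
    print(find_putative_polya_signals("GGCAATAAAGG"))
```

Because downstream sequences that are not yet defined also contribute to proper
processing, the output of such a scan is best read as a list of candidates for
closer inspection.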

In naturally occurring mRNAs that are normally polyadenylated, it has been
observed that disruption of this process, either by altering the polyA signal or
other sequences in the mRNA, can have profound effects on the level of
functional mRNA. This has been observed in several naturally occurring mRNAs,
with results that are gene-specific so far.

It has been shown that in natural mRNAs proper polyadenylation is important in
mRNA accumulation, and that disruption of this process can affect mRNA levels
significantly. However, insufficient knowledge exists to predict the effect of
changes in a normal gene. In a heterologous gene, it is even harder to predict
the consequences. However, it is possible that the putative sites identified are
dysfunctional. That is, these sites may not act as proper polyA sites, but
instead function as aberrant sites that give rise to unstable mRNAs.

In animal cell systems, AATAAA is by far the most common signal identified in
mRNAs upstream of the polyA, but at least four variants have also been found
(Wickens and Stephenson, 1984). In plants, not nearly so much analysis has been
done, but it is clear that multiple sequences similar to AATAAA can be used. The
plant sites in Table 5 called major or minor refer only to the study of Dean et
al. (1986) which analyzed only three types of plant gene. The designation of
polyadenylation sites as major or minor refers only to the frequency of their
occurrence as functional sites in naturally occurring genes that have been
analyzed. In the case of plants this is a very limited database. It is hard to
predict with any certainty that a site designated major or minor is more or less
likely to function partially or completely when found in a heterologous gene
such as those encoding the crystal proteins of the present invention.



Table 5

Polyadenylation Sites in Plant Genes

[Entries preceding P9A are illegible in the source.]

P9A     ATTAAT    "
P10A    ATACAT    "
P11A    AAAATA    "
P12A    ATTAAA    Minor animal site
P13A    AATTAA    "
P14A    AATACA    "
P15A    CATAAA    "


The present invention provides a method for preparing synthetic plant genes
that express their protein product at levels significantly higher than the
wild-type genes commonly employed in plant transformation heretofore. In
another aspect, the present invention also provides novel synthetic plant
genes which encode non-plant proteins.

As described above, the expression of native B. thuringiensis genes in plants
is often problematic. The nature of the coding sequences of B. thuringiensis
genes distinguishes them from plant genes as well as many other heterologous
genes expressed in plants. In particular, B. thuringiensis genes are very rich
(~62%) in adenine (A) and thymine (T), while plant genes and most other
bacterial genes which have been expressed in plants are on the order of 45-55%
A+T.

Due to the degeneracy of the genetic code and the limited number of codon
choices for any amino acid, most of the "excess" A+T of the structural coding
sequences of some Bacillus species is found in the third position of the
codons. That is, genes of some Bacillus species have A or T as the third
nucleotide in many codons. Thus A+T content in part can determine codon usage
bias. In addition, it is clear that genes evolve for maximum function in the
organism in which they evolve. This means that particular nucleotide sequences
found in a gene from one organism, where they may play no role except to code
for a particular stretch of amino acids, have the potential to be recognized
as gene control elements in another organism (such as transcriptional
promoters or terminators, polyA addition sites, intron splice sites, or
specific mRNA degradation signals). It is perhaps surprising that such misread
signals are not a more common feature of heterologous gene expression, but
this can be explained in part by the relatively homogeneous A+T content (~50%)
of many organisms. This A+T content plus the nature of the genetic code put
clear constraints on the likelihood of occurrence of any particular
oligonucleotide sequence. Thus, a gene from E. coli with a 50% A+T content is
much less likely to contain any particular A+T rich segment than a gene from
B. thuringiensis.
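
The A+T figures discussed above can be checked for any coding sequence with a
few lines of code. The following is a minimal sketch, not part of the
described method: the function names are illustrative, the example fragment is
hypothetical (it is not taken from any cry gene), and the sequence is assumed
to be in frame and to contain only A, C, G and T.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in a DNA sequence that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def third_position_at_fraction(coding_seq: str) -> float:
    """Fraction of codons whose third base is A or T.

    Assumes the sequence starts at the first base of a codon.
    """
    thirds = coding_seq.upper()[2::3]  # every third base, starting at index 2
    return sum(base in "AT" for base in thirds) / len(thirds)


# Hypothetical 12-nucleotide fragment (four codons), for illustration only.
example = "ATGAAATTAGGC"
print(f"overall A+T:        {at_fraction(example):.2f}")
print(f"third-position A+T: {third_position_at_fraction(example):.2f}")
```

Comparing the two values for a native gene and for a modified gene gives a
quick indication of how much of the A+T excess sits in the third codon
position.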

Typically, to obtain high-level expression of the δ-endotoxin genes in plants,
the existing structural coding sequence ("structural gene") which codes for
the δ-endotoxin is modified by removal of ATTTA sequences and putative
polyadenylation signals by site-directed mutagenesis of the DNA comprising the
structural gene. It is most preferred that substantially all the
polyadenylation signals and ATTTA sequences are removed, although enhanced
expression levels are observed with only partial removal of either of the
above identified sequences. Alternatively, if a synthetic gene is prepared
which codes for the expression of the subject protein, codons are selected to
avoid the ATTTA sequence and putative polyadenylation signals. For purposes of
the present invention putative polyadenylation signals include, but are not
necessarily limited to, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA,
ATAAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and
CATAAA. In replacing the ATTTA sequences and polyadenylation signals, codons
are preferably utilized which avoid the codons which are rarely found in plant
genomes.
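
The motif search described in the preceding paragraph lends itself to a simple
scan. The sketch below is illustrative only: the motif list is the one recited
above, while the function names and the test sequence are hypothetical.
Overlapping occurrences are counted, and only the given strand is searched.

```python
import re

# Putative polyadenylation signals as listed in the text above.
PUTATIVE_POLYA_SIGNALS = [
    "AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA",
    "ATAAAA", "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA",
    "ATTAAA", "AATTAA", "AATACA", "CATAAA",
]
ATTTA_MOTIF = "ATTTA"


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of motif, including overlapping hits."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq.upper())]


def scan_gene(seq: str) -> dict[str, list[int]]:
    """Map each motif found in seq to the positions at which it occurs."""
    hits = {}
    for motif in PUTATIVE_POLYA_SIGNALS + [ATTTA_MOTIF]:
        positions = find_motif(seq, motif)
        if positions:
            hits[motif] = positions
    return hits


# Hypothetical test sequence containing one AATAAA and one ATTTA.
print(scan_gene("GGCAATAAAGGATTTAGC"))
```

Each reported position is a candidate for alteration by mutagenesis, subject
to keeping the encoded amino acid sequence unchanged.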

The selected DNA sequence is scanned to identify regions with greater than
four consecutive adenine (A) or thymine (T) nucleotides. The A+T regions are
scanned for potential plant polyadenylation signals. Although the absence of
five or more consecutive A or T nucleotides eliminates most plant
polyadenylation signals, if more than one of the minor polyadenylation signals
is identified within ten nucleotides of each other, then the nucleotide
sequence of this region is preferably altered to remove these signals while
maintaining the original encoded amino acid sequence.

The second step is to consider the about 15 to about 30 or so nucleotide
residues surrounding the A+T rich region identified in step one. If the A+T
content of the surrounding region is less than 80%, the region should be
examined for polyadenylation signals. Alteration of the region based on
polyadenylation signals is dependent upon (1) the number of polyadenylation
signals present and (2) the presence of a major plant polyadenylation signal.

The extended region is examined for the presence of plant polyadenylation
signals. The polyadenylation signals are removed by site-directed mutagenesis
of the DNA sequence. The extended region is also examined for multiple copies
of the ATTTA sequence, which are also removed by mutagenesis.
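
The first two steps above (locating runs of five or more consecutive A or T
nucleotides, then inspecting the surrounding region) can be sketched as
follows. This is one possible reading, offered as an assumption: the
"surrounding" residues are taken here as a fixed flank on each side of the
run, with the flank length as a parameter. The function names and the example
sequence are hypothetical.

```python
import re


def at_runs(seq: str, min_len: int = 5) -> list[tuple[int, int]]:
    """Return (start, end) pairs for runs of at least min_len A/T bases.

    Positions are 0-based and end is exclusive.
    """
    pattern = f"[AT]{{{min_len},}}"  # e.g. [AT]{5,}
    return [(m.start(), m.end()) for m in re.finditer(pattern, seq.upper())]


def extended_region(seq: str, start: int, end: int, flank: int = 15) -> str:
    """Return the run plus up to `flank` bases on each side (assumed reading)."""
    return seq[max(0, start - flank):min(len(seq), end + flank)]


def at_fraction(seq: str) -> float:
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq.upper()) / len(seq)


# Hypothetical sequence with a single seven-base A/T run.
gene = "GCGCATTTAATGCGCGC"
for start, end in at_runs(gene):
    region = extended_region(gene, start, end, flank=15)
    print(start, end, region, f"{at_fraction(region):.2f}")
```

The extended region returned for each run is the stretch that would then be
examined for polyadenylation signals and ATTTA sequences.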

It is also preferred that regions comprising many consecutive A+T bases or G+C
bases are disrupted, since these regions are predicted to have a higher
likelihood to form hairpin structures due to self-complementarity. Therefore,
insertion of heterogeneous base pairs would reduce the likelihood of
self-complementary secondary structure formation, which is known to inhibit
transcription and/or translation in some organisms. In most cases, the adverse
effects may be minimized by using sequences which do not contain more than
five consecutive A+T or G+C.

4.11.5 Synthetic Oligonucleotides for Mutagenesis

When oligonucleotides are used in the mutagenesis, it is desirable to maintain
the proper amino acid sequence and reading frame, without introducing common
restriction sites such as BglII, HindIII, SacI, KpnI, EcoRI, NcoI, PstI and
SalI into the modified gene. These restriction sites are found in poly-linker
insertion sites of many cloning vectors. Of course, the introduction of new
polyadenylation signals, ATTTA sequences or consecutive stretches of more than
five A+T or G+C should also be avoided. The preferred size for the
oligonucleotides is about 40 to about 50 bases, but fragments ranging from
about 18 to about 100 bases have been utilized. In most cases, a minimum of
about 5 to about 8 base pairs of homology to the template DNA on both ends of
the synthesized fragment is maintained to ensure proper hybridization of the
primer to the template. The oligonucleotides should avoid sequences longer
than five base pairs A+T or G+C. Codons used in the replacement of wild-type
codons should preferably avoid the TA or CG doublet wherever possible. Codons
are selected from a plant preferred codon table (such as Table 6 below) so as
to avoid codons which are rarely found in plant genomes, and efforts should be
made to select codons to preferably adjust the G+C content to about 50%.
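
Several of the oligonucleotide design constraints above can be checked
mechanically. The sketch below is illustrative and makes two assumptions: the
recognition sequences shown are the standard ones for the named enzymes (all
eight are palindromic, so searching one strand suffices), and the helper names
and the example oligonucleotide are hypothetical.

```python
import re

# Standard recognition sequences for the enzymes named in the text.
RESTRICTION_SITES = {
    "BglII": "AGATCT", "HindIII": "AAGCTT", "SacI": "GAGCTC", "KpnI": "GGTACC",
    "EcoRI": "GAATTC", "NcoI": "CCATGG", "PstI": "CTGCAG", "SalI": "GTCGAC",
}


def restriction_sites_present(seq: str) -> list[str]:
    """Names of the listed enzymes whose site occurs in seq, sorted by name."""
    seq = seq.upper()
    return sorted(name for name, site in RESTRICTION_SITES.items() if site in seq)


def longest_homogeneous_run(seq: str) -> int:
    """Length of the longest stretch made up only of A/T or only of G/C."""
    runs = re.findall(r"[AT]+|[GC]+", seq.upper())
    return max((len(run) for run in runs), default=0)


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


# Hypothetical oligonucleotide that happens to carry NcoI and HindIII sites.
oligo = "CCATGGCTAAGCTTGA"
print(restriction_sites_present(oligo))
print(longest_homogeneous_run(oligo), f"{gc_fraction(oligo):.2f}")
```

A candidate oligonucleotide would be reconsidered if the first function
returns any enzyme name or if the longest homogeneous run exceeds five bases.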
Table 6

Preferred Codon Usage In Plants

Amino Acid    Codon    Percent Usage in Plants

ARG           CGA        7
              CGC       11
              CGG        5
              CGU       25
              AGA       29
              AGG       23

LEU           CUA        8
              CUC       20
              CUG       10
              CUU       28
              UUA        5
              UUG       30

SER           UCA       14
              UCC       26
              UCG        3
              UCU       21
              AGC       21
              AGU       15

THR           ACA       21
              ACC       41
              ACG        7
              ACU       31

PRO           CCA       45
              CCC       19
              CCG        9
              CCU       26

ALA           GCA       23
              GCC       32
              GCG        3
              GCU       41

GLY           GGA       32
              GGC       20
              GGG       11
              GGU       37

ILE           AUA       12
              AUC       45
              AUU       43

VAL           GUA        9
              GUC       20
              GUG       28
              GUU       43

LYS           AAA       36
              AAG       64

ASN           AAC       72
              AAU       28

GLN           CAA       64
              CAG       36

HIS           CAC       65
              CAU       35

GLU           GAA       48
              GAG       52

ASP           GAC       48
              GAU       52

TYR           UAC       68
              UAU       32

CYS           UGC       78
              UGU       22

PHE           UUC       56
              UUU       44

MET           AUG      100

TRP           UGG      100



Regions with many consecutive A+T bases or G+C bases are predicted to have a
higher likelihood to form hairpin structures due to self-complementarity.
Disruption of these regions by the insertion of heterogeneous base pairs is
preferred and should reduce the likelihood of the formation of
self-complementary secondary structures such as hairpins, which are known in
some organisms to inhibit transcription (transcriptional terminators) and
translation (attenuators).

Alternatively, a completely synthetic gene for a given amino acid sequence can
be prepared, with regions of five or more consecutive A+T or G+C nucleotides
being avoided. Codons are selected avoiding the TA and CG doublets in codons
whenever possible. Codon usage can be normalized against a plant preferred
codon usage table (such as Table 6) and the G+C content preferably adjusted to
about 50%. The resulting sequence should be examined to ensure that there are
minimal putative plant polyadenylation signals and ATTTA sequences.
Restriction sites found in commonly used cloning vectors are also preferably
avoided. However, placement of several unique restriction sites throughout the
gene is useful for analysis of gene expression or construction of gene
variants.
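
As a starting point for such a synthetic gene, an amino acid sequence can be
back-translated with the most frequently used codon for each amino acid in
Table 6. The sketch below is a deliberate simplification and not the full
procedure: always taking the single most-used codon ignores the further
constraints named above (TA and CG doublets, runs of A+T or G+C,
polyadenylation signals, ATTTA sequences, restriction sites), so its output
would still need to be screened and adjusted. The function names and the
example peptide are hypothetical.

```python
# Most frequently used codon per amino acid, taken from Table 6 (RNA alphabet).
MOST_USED_CODON = {
    "R": "AGA", "L": "UUG", "S": "UCC", "T": "ACC", "P": "CCA",
    "A": "GCU", "G": "GGU", "I": "AUC", "V": "GUU", "K": "AAG",
    "N": "AAC", "Q": "CAA", "H": "CAC", "E": "GAG", "D": "GAU",
    "Y": "UAC", "C": "UGC", "F": "UUC", "M": "AUG", "W": "UGG",
}


def back_translate(protein: str) -> str:
    """Return a DNA coding sequence using the most-used plant codon per residue.

    The protein is given in one-letter code; a KeyError is raised for any
    character that is not one of the twenty standard amino acids.
    """
    rna = "".join(MOST_USED_CODON[residue] for residue in protein.upper())
    return rna.replace("U", "T")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


# Hypothetical five-residue peptide, for illustration only.
dna = back_translate("MKWRL")
print(dna, f"G+C = {gc_fraction(dna):.2f}")
```

Where the screening steps flag a problem, an alternative codon for the same
amino acid can be substituted from Table 6, preferring those with higher
percent usage.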

4.11.6 "Plantized" Gene Constructs

The expression of a plant gene which exists in double-stranded DNA form
involves transcription of messenger RNA (mRNA) from one strand of the DNA by
RNA polymerase enzyme, and the subsequent processing of the mRNA primary
transcript inside the nucleus. This processing involves a 3' non-translated
region which causes the addition of polyadenylate nucleotides to the 3' end of
the RNA. Transcription of DNA into mRNA is regulated by a region of DNA
usually referred to as the "promoter." The promoter region contains a sequence
of bases that signals RNA polymerase to associate with the DNA and to initiate
the transcription of mRNA using one of the DNA strands as a template to make a
corresponding strand of RNA.

A number of promoters which are active in plant cells have been described in
the literature. These include the nopaline synthase (NOS) and octopine
synthase (OCS) promoters (which are carried on tumor-inducing plasmids of
Agrobacterium tumefaciens), the Cauliflower Mosaic Virus (CaMV) 19S and 35S
promoters, the light-inducible promoter from the small subunit of ribulose
bis-phosphate carboxylase (ssRUBISCO, a very abundant plant polypeptide) and
the mannopine synthase (MAS) promoter (Velten et al., 1984 and Velten and
Schell, 1985). All of these promoters have been used to create various types
of DNA constructs which have been expressed in plants (see e.g., Int. Pat.
Appl. Publ. No. WO 84/02913).

Promoters which are known or are found to cause transcription of RNA in 
plant cells can be used in the present invention. Such promoters may be obtained 
from plants or plant viruses and include, but are not limited to, the CaMV35S pro- 
moter and promoters isolated from plant genes such as ssRUBISCO genes. As de- 
scribed below, it is preferred that the particular promoter selected should be capa- 
ble of causing sufficient expression to result in the production of an effective 
amount of protein. 

The promoters used in the DNA constructs (i.e. chimeric plant genes) of the 
present invention may be modified, if desired, to affect their control characteristics. 
For example, the CaMV35S promoter may be ligated to the portion of the ssRU- 
BISCO gene that represses the expression of ssRUBISCO in the absence of light, 
to create a promoter which is active in leaves but not in roots. The resulting chi- 
meric promoter may be used as described herein. For purposes of this description, 
the phrase "CaMV35S" promoter thus includes variations of CaMV35S promoter, 
e.g., promoters derived by means of ligation with operator regions, random or
controlled mutagenesis, etc. Furthermore, the promoters may be altered to
contain multiple "enhancer sequences" to assist in elevating gene expression.

The RNA produced by a DNA construct of the present invention also contains a
5' non-translated leader sequence. This sequence can be derived from the
promoter selected to express the gene, and can be specifically modified so as
to increase translation of the mRNA. The 5' non-translated regions can also be
obtained from viral RNAs, from suitable eukaryotic genes, or from a synthetic
gene sequence. The present invention is not limited to constructs as presented
in the following examples. Rather, the non-translated leader sequence can be
part of the 5' end of the non-translated region of the coding sequence for the
virus coat protein, or part of the promoter sequence, or can be derived from
an unrelated promoter or coding sequence. In any case, it is preferred that
the sequence flanking the initiation site conform to the translational
consensus sequence rules for enhanced translation initiation reported by Kozak
(1984).

The cry DNA constructs of the present invention may also contain one or more
modified or fully-synthetic structural coding sequences which have been
changed to enhance the performance of the cry gene in plants. The structural
genes of the present invention may optionally encode a fusion protein
comprising an amino-terminal chloroplast transit peptide or secretory signal
sequence.

The DNA construct also contains a 3' non-translated region. The 3'
non-translated region contains a polyadenylation signal which functions in
plants to cause the addition of polyadenylate nucleotides to the 3' end of the
viral RNA. Examples of suitable 3' regions are (1) the 3' transcribed,
non-translated regions containing the polyadenylation signal of Agrobacterium
tumor-inducing (Ti) plasmid genes, such as the nopaline synthase (NOS) gene,
and (2) plant genes like the soybean storage protein (7S) genes and the small
subunit of the RuBP carboxylase (E9) gene.

4.12 Methods for Producing Insect-Resistant Transgenic Plants 

By transforming a suitable host cell, such as a plant cell, with a recombinant
cry* gene-containing segment, the expression of the encoded crystal protein
(i.e., a bacterial crystal protein or polypeptide having insecticidal activity
against coleopterans) can result in the formation of insect-resistant plants.

By way of example, one may utilize an expression vector containing a coding
region for a B. thuringiensis crystal protein and an appropriate selectable
marker to transform a suspension of embryonic plant cells, such as wheat or
corn cells, using a method such as particle bombardment (Maddock et al., 1991;
Vasil et al., 1992) to deliver the DNA coated on microprojectiles into the
recipient cells. Transgenic plants are then regenerated from transformed
embryonic calli that express the insecticidal proteins.

The formation of transgenic plants may also be accomplished using other
methods of cell transformation which are known in the art, such as
Agrobacterium-mediated DNA transfer (Fraley et al., 1983). Alternatively, DNA
can be introduced into plants by direct DNA transfer into pollen (Zhou et al.,
1983; Hess, 1987; Luo et al., 1988), by injection of the DNA into reproductive
organs of a plant (Pena et al., 1987), or by direct injection of DNA into the
cells of immature embryos followed by the rehydration of desiccated embryos
(Neuhaus et al., 1987; Benbrook et al., 1986).

The regeneration, development, and cultivation of plants from single plant
protoplast transformants or from various transformed explants is well known in
the art (Weissbach and Weissbach, 1988). This regeneration and growth process
typically includes the steps of selecting transformed cells and culturing
those individualized cells through the usual stages of embryonic development
through the rooted plantlet stage. Transgenic embryos and seeds are similarly
regenerated. The resulting transgenic rooted shoots are thereafter planted in
an appropriate plant growth medium such as soil.

The development or regeneration of plants containing the foreign, exogenous
gene that encodes a polypeptide of interest introduced by Agrobacterium from
leaf explants can be achieved by methods well known in the art such as
described (Horsch et al., 1985). In this procedure, transformants are cultured
in the presence of a selection agent and in a medium that induces the
regeneration of shoots in the plant strain being transformed as described
(Fraley et al., 1983).

This procedure typically produces shoots within two to four months and those
shoots are then transferred to an appropriate root-inducing medium containing
the selective agent and an antibiotic to prevent bacterial growth. Shoots that
rooted in the presence of the selective agent to form plantlets are then
transplanted to soil or other media to allow the production of roots. These
procedures vary depending upon the particular plant strain employed, such
variations being well known in the art.

Preferably, the regenerated plants are self-pollinated to provide homozygous
transgenic plants, as discussed before. Otherwise, pollen obtained from the
regenerated plants is crossed to seed-grown plants of agronomically important,
preferably inbred lines. Conversely, pollen from plants of those important
lines is used to pollinate regenerated plants. A transgenic plant of the
present invention containing a desired polypeptide is cultivated using methods
well known to one skilled in the art.

Such plants can form germ cells and transmit the transformed trait(s) to
progeny plants. Likewise, transgenic plants can be grown in the normal manner
and crossed with plants that have the same transformed hereditary factors or
other hereditary factors. The resulting hybrid individuals have the
corresponding phenotypic properties. A transgenic plant of this invention thus
has an increased amount of a coding region (e.g., a mutated cry gene) that
encodes the mutated Cry polypeptide of interest. A preferred transgenic plant
is an independent segregant and can transmit that gene and its activity to its
progeny. A more preferred transgenic plant is homozygous for that gene, and
transmits that gene to all of its offspring on sexual mating.

Seed from a transgenic plant may be grown in the field or greenhouse, and
resulting sexually mature transgenic plants are self-pollinated to generate
true breeding plants. The progeny from these plants become true breeding lines
that are evaluated for, by way of example, increased insecticidal capacity
against coleopteran insects, preferably in the field, under a range of
environmental conditions. The inventors contemplate that the present invention
will find particular utility in the creation of transgenic plants of
commercial interest including various grasses, grains, fibers, tubers,
legumes, ornamental plants, cacti, succulents, fruits, berries, and
vegetables, as well as a number of nut- and fruit-bearing trees and plants.

4.13 Methods for Producing Combinatorial Cry3* Variants 

Crystal protein mutants containing substitutions in one or more domains may be
constructed via a number of techniques. For instance, sequences of highly
related genes can be readily shuffled using the PCR™-based technique described
by Stemmer (1994). Alternatively, if suitable restriction sites are available,
the mutations of one cry gene may be combined with the mutations of a second
cry gene by routine subcloning methodologies. If a suitable restriction site
is not available, one may be generated by oligonucleotide-directed mutagenesis
using any number of procedures known to those skilled in the art.
Alternatively, splice-overlap extension PCR™ (Horton et al., 1989) may be used
to combine mutations in different regions of a crystal protein. In this
procedure, overlapping DNA fragments generated by the PCR™ and containing
different mutations within their unique sequences may be annealed and used as
a template for amplification using flanking primers to generate a hybrid gene
sequence. Finally, cry* mutants may be combined by simply using one cry mutant
as a template for oligonucleotide-directed mutagenesis using any number of
protocols such as those described herein.

4.14 Isolating Homologous Gene and Gene Fragments 

The genes and δ-endotoxins according to the subject invention include not only
the full length sequences disclosed herein but also fragments of these
sequences, or fusion proteins, which retain the characteristic insecticidal
activity of the sequences specifically exemplified herein.

It should be apparent to a person skilled in this art that insecticidal
δ-endotoxins can be identified and obtained through several means. The
specific genes, or portions thereof, may be obtained from a culture
depository, or constructed synthetically, for example, by use of a gene
machine. Variations of these genes may be readily constructed using standard
techniques for making point mutations. Also, fragments of these genes can be
made using commercially available exonucleases or endonucleases according to
standard procedures. For example, enzymes such as Bal31 or site-directed
mutagenesis can be used to systematically cut off nucleotides from the ends of
these genes. Also, genes which code for active fragments may be obtained using
a variety of other restriction enzymes. Proteases may be used to directly
obtain active fragments of these δ-endotoxins.

Equivalent δ-endotoxins and/or genes encoding these equivalent δ-endotoxins
can also be isolated from Bacillus strains and/or DNA libraries using the
teachings provided herein. For example, antibodies to the δ-endotoxins
disclosed and claimed herein can be used to identify and isolate other
δ-endotoxins from a mixture of proteins. Specifically, antibodies may be
raised to the portions of the δ-endotoxins which are most constant and most
distinct from other B. thuringiensis δ-endotoxins. These antibodies can then
be used to specifically identify equivalent δ-endotoxins with the
characteristic insecticidal activity by immunoprecipitation, enzyme linked
immunoassay (ELISA), or Western blotting.

A further method for identifying the δ-endotoxins and genes of the subject
invention is through the use of oligonucleotide probes. These probes are
nucleotide sequences having a detectable label. As is well known in the art,
if the probe molecule and nucleic acid sample hybridize by forming a strong
bond between the two molecules, it can be reasonably assumed that the probe
and sample are essentially identical. The probe's detectable label provides a
means for determining in a known manner whether hybridization has occurred.
Such a probe analysis provides a rapid method for identifying insecticidal
δ-endotoxin genes of the subject invention.

The nucleotide segments which are used as probes according to the invention
can be synthesized by use of DNA synthesizers using standard procedures. In
the use of the nucleotide segments as probes, the particular probe is labeled
with any suitable label known to those skilled in the art, including
radioactive and non-radioactive labels. Typical radioactive labels include
32P, 125I, 35S, or the like. A probe labeled with a radioactive isotope can be
constructed from a nucleotide sequence complementary to the DNA sample by a
conventional nick translation reaction, using a DNase and DNA polymerase. The
probe and sample can then be combined in a hybridization buffer solution and
held at an appropriate temperature until annealing occurs. Thereafter, the
membrane is washed free of extraneous materials, leaving the sample and bound
probe molecules typically detected and quantified by autoradiography and/or
liquid scintillation counting.

Non-radioactive labels include, for example, ligands such as biotin or
thyroxine, as well as enzymes such as hydrolases or peroxidases, or the
various chemiluminescers such as luciferin, or fluorescent compounds like
fluorescein and its derivatives. The probe may also be labeled at both ends
with different types of labels for ease of separation, as, for example, by
using an isotopic label at the end mentioned above and a biotin label at the
other end.

Duplex formation and stability depend on substantial complementarity be- 
tween the two strands of a hybrid, and, as noted above, a certain degree of mis- 
match can be tolerated. Therefore, the probes of the subject invention include mu- 
tations (both single and multiple), deletions, insertions of the described sequences, 
and combinations thereof, wherein said mutations, insertions and deletions permit 
formation of stable hybrids with the target polynucleotide of interest. Mutations, 
insertions, and deletions can be produced in a given polynucleotide sequence in 
many ways, by methods currently known to an ordinarily skilled artisan, and per- 
haps by other methods which may become known in the future. 

The potential variations in the probes listed are due, in part, to the
redundancy of the genetic code. Because of this redundancy, more than one
coding nucleotide triplet (codon) can be used for most of the amino acids used
to make proteins. Therefore different nucleotide sequences can code for a
particular amino acid. Thus, the amino acid sequences of the B. thuringiensis
δ-endotoxins and peptides can be prepared by equivalent nucleotide sequences
encoding the same amino acid sequence of the protein or peptide. Accordingly,
the subject invention includes such equivalent nucleotide sequences. Also,
inverse or complement sequences are an aspect of the subject invention and can
be readily used by a person skilled in this art. In addition it has been shown
that proteins of identified structure and function may be constructed by
changing the amino acid sequence if such changes do not alter the protein
secondary structure (Kaiser and Kezdy, 1984). Thus, the subject invention
includes mutants of the amino acid sequence depicted herein which do not alter
the protein secondary structure, or if the structure is altered, the
biological activity is substantially retained. Further, the invention also
includes mutants of organisms hosting all or part of a δ-endotoxin-encoding
gene of the invention. Such mutants can be made by techniques well known to
persons skilled in the art. For example, UV irradiation can be used to prepare
mutants of host organisms. Likewise, such mutants may include asporogenous
host cells which also can be prepared by procedures well known in the art.
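
The redundancy of the genetic code referred to above can be made concrete by
counting how many distinct nucleotide sequences encode a given peptide, and by
forming the complementary strand of a sequence. The sketch below is
illustrative only: it uses the standard genetic code's codon counts per amino
acid (stop codons excluded), and the function names and example inputs are
hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "R": 6, "L": 6, "S": 6, "T": 4, "P": 4, "A": 4, "G": 4, "V": 4,
    "I": 3, "K": 2, "N": 2, "Q": 2, "H": 2, "E": 2, "D": 2, "Y": 2,
    "C": 2, "F": 2, "M": 1, "W": 1,
}


def equivalent_sequence_count(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (one-letter code)."""
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide.upper())


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(complement)[::-1]


print(equivalent_sequence_count("RLS"))  # three six-codon residues
print(reverse_complement("ATGAAG"))
```

Even a short peptide therefore corresponds to many equivalent nucleotide
sequences, any of which could serve as the basis for a probe.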

4.15 Ribozymes

Ribozymes are enzymatic RNA molecules which cleave particular mRNA species. In
certain embodiments, the inventors contemplate the selection and utilization
of ribozymes capable of cleaving the RNA segments of the present invention,
and their use to reduce activity of target mRNAs in particular cell types or
tissues.

Six basic varieties of naturally-occurring enzymatic RNAs are known presently.
Each can catalyze the hydrolysis of RNA phosphodiester bonds in trans (and
thus can cleave other RNA molecules) under physiological conditions. In
general, enzymatic nucleic acids act by first binding to a target RNA. Such
binding occurs through the target binding portion of an enzymatic nucleic acid
which is held in close proximity to an enzymatic portion of the molecule that
acts to cleave the target RNA. Thus, the enzymatic nucleic acid first
recognizes and then binds a target RNA through complementary base-pairing, and
once bound to the correct site, acts enzymatically to cut the target RNA.
Strategic cleavage of such a target RNA will destroy its ability to direct
synthesis of an encoded protein. After an enzymatic nucleic acid has bound and
cleaved its RNA target, it is released from that RNA to search for another
target and can repeatedly bind and cleave new targets.

The enzymatic nature of a ribozyme is advantageous over many technologies,
such as antisense technology (where a nucleic acid molecule simply binds to a
nucleic acid target to block its translation), since the concentration of
ribozyme necessary to effect a therapeutic treatment is lower than that of an
antisense oligonucleotide. This advantage reflects the ability of the ribozyme
to act enzymatically. Thus, a single ribozyme molecule is able to cleave many
molecules of target RNA. In addition, the ribozyme is a highly specific
inhibitor, with the specificity of inhibition depending not only on the base
pairing mechanism of binding to the target RNA, but also on the mechanism of
target RNA cleavage. Single mismatches, or base-substitutions, near the site
of cleavage can completely eliminate catalytic activity of a ribozyme. Similar
mismatches in antisense molecules do not prevent their action (Woolf et al.,
1992). Thus, the specificity of action of a ribozyme is greater than that of
an antisense oligonucleotide binding the same RNA site.

The enzymatic nucleic acid molecule may be formed in a hammerhead, hairpin,
hepatitis δ virus, group I intron or RNaseP RNA (in association with an RNA
guide sequence) or Neurospora VS RNA motif. Examples of hammerhead motifs are
described by Rossi et al. (1992); examples of hairpin motifs are described by
Hampel et al. (Eur. Pat. EP 0360257), Hampel and Tritz (1989), Hampel et al.
(1990) and Cech et al. (U.S. Patent 5,631,359); an example of the hepatitis δ
virus motif is described by Perrotta and Been (1992); an example of the RNaseP
motif is described by Guerrier-Takada et al. (1983); the Neurospora VS RNA
ribozyme motif is described by Collins (Saville and Collins, 1990; Saville and
Collins, 1991; Collins and Olive, 1993); and an example of the Group I intron
is described by Cech et al. (U.S. Patent 4,987,071). All that is important in
an enzymatic nucleic acid molecule of this invention is that it has a specific
substrate binding site which is complementary to one or more of the target
gene RNA regions, and that it has nucleotide sequences within or surrounding
that substrate binding site which impart an RNA cleaving activity to the
molecule. Thus the ribozyme constructs need not be limited to specific motifs
mentioned herein.

The invention provides a method for producing a class of enzymatic cleaving
agents which exhibit a high degree of specificity for the RNA of a desired
target. The enzymatic nucleic acid molecule is preferably targeted to a highly
conserved sequence region of a target mRNA such that specific treatment of a
disease or condition can be provided with either one or several enzymatic
nucleic acids. Such enzymatic nucleic acid molecules can be delivered
exogenously to specific cells as required. Alternatively, the ribozymes can be
expressed from DNA or RNA vectors that are delivered to specific cells.

Small enzymatic nucleic acid motifs (e.g., of the hammerhead or the hairpin
structure) may be used for exogenous delivery. The simple structure of these
molecules increases the ability of the enzymatic nucleic acid to invade
targeted regions of the mRNA structure. Alternatively, catalytic RNA molecules
can be expressed within cells from eukaryotic promoters (e.g., Scanlon et al.,
1991; Kashani-Sabet et al., 1992; Dropulic et al., 1992; Weerasinghe et al.,
1991; Ojwang et al., 1992; Chen et al., 1992; Sarver et al., 1990). Those
skilled in the art realize that any ribozyme can be expressed in eukaryotic
cells from the appropriate DNA vector. The activity of such ribozymes can be
augmented by their release from the primary transcript by a second ribozyme
(Draper et al., Int. Pat. Appl. Publ. No. WO 93/23569, and Sullivan et al.,
Int. Pat. Appl. Publ. No. WO 94/02595, both hereby incorporated in their
totality by reference herein; Ohkawa et al., 1992; Taira et al., 1991; Ventura
et al., 1993).

Ribozymes may be added directly, or can be complexed with cationic lip- 
ids, lipid complexes, packaged within liposomes, or otherwise delivered to target 
cells. The RNA or RNA complexes can be locally administered to relevant tissues 
ex vivo, or in vivo through injection, aerosol inhalation, infusion pump or stent, 
with or without their incorporation in biopolymers. 

Ribozymes may be designed as described in Draper et al. (Int. Pat. Appl.
Publ. No. WO 93/23569), or Sullivan et al. (Int. Pat. Appl. Publ. No. WO
94/02595) and synthesized to be tested in vitro and in vivo, as described. Such
ribozymes can also be optimized for delivery. While specific examples are
provided, those in the art will recognize that equivalent RNA targets in other
species can be utilized when necessary.

Hammerhead or hairpin ribozymes may be individually analyzed by computer
folding (Jaeger et al., 1989) to assess whether the ribozyme sequences fold
into the appropriate secondary structure. Those ribozymes with unfavorable
intramolecular interactions between the binding arms and the catalytic core are
eliminated from consideration. Varying binding arm lengths can be chosen to
optimize activity. Generally, at least 5 bases on each arm are able to bind to,
or otherwise interact with, the target RNA.
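
The binding-arm criterion can be illustrated with a minimal Python sketch. It
assumes strict Watson-Crick pairing (G-U wobble pairs are ignored), and the
sequences, the function names and the `min_length` parameter are hypothetical
choices made for illustration. It is not a substitute for the folding analysis
of Jaeger et al. (1989).

```python
# Watson-Crick partners for RNA bases (wobble pairs deliberately ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def arm_binds_target(arm: str, target_region: str, min_length: int = 5) -> bool:
    """Check that a binding arm is long enough and pairs fully with the target.

    Both sequences are written 5'->3'; pairing is antiparallel, so the arm
    must equal the reverse complement of the target region.
    """
    return len(arm) >= min_length and arm.upper() == reverse_complement(target_region)


if __name__ == "__main__":
    # Hypothetical target region and arms, for illustration only.
    target = "GGAUCCA"
    print(arm_binds_target("UGGAUCC", target))  # arm pairs with every base
    print(arm_binds_target("GCAU", "AUGC"))     # complementary but too short
```

A candidate arm that fails this check would be lengthened or redesigned before
any folding analysis is attempted.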

Ribozymes of the hammerhead or hairpin motif may be designed to anneal to
various sites in the mRNA message, and can be chemically synthesized. The
method of synthesis used follows the procedure for normal RNA synthesis as
described in Usman et al. (1987) and in Scaringe et al. (1990) and makes use of
common nucleic acid protecting and coupling groups, such as dimethoxytrityl at
the 5'-end, and phosphoramidites at the 3'-end. Average stepwise coupling
yields are typically >98%. Hairpin ribozymes may be synthesized in two parts
and annealed to reconstruct an active ribozyme (Chowrira and Burke, 1992).
Ribozymes may be modified extensively to enhance stability by modification with
nuclease resistant groups, for example, 2'-amino, 2'-C-allyl, 2'-fluoro,
2'-O-methyl, 2'-H (for a review see Usman and Cedergren, 1992). Ribozymes may
be purified by gel electrophoresis using general methods or by high pressure
liquid chromatography and resuspended in water.

Ribozyme activity can be optimized by altering the length of the ribozyme
binding arms, or chemically synthesizing ribozymes with modifications that
prevent their degradation by serum ribonucleases (see e.g., Int. Pat. Appl.
Publ. No. WO 92/07065; Perrault et al., 1990; Pieken et al., 1991; Usman and
Cedergren, 1992; Int. Pat. Appl. Publ. No. WO 93/15187; Int. Pat. Appl. Publ.
No. WO 91/03162; Eur. Pat. Appl. Publ. No. 92110298.4; U.S. Patent 5,334,711;
and Int. Pat. Appl. Publ. No. WO 94/13688, which describe various chemical
modifications that can be made to the sugar moieties of enzymatic RNA
molecules), modifications which enhance their efficacy in cells, and removal of
stem II bases to shorten RNA synthesis times and reduce chemical requirements.

Sullivan et al. (Int. Pat. Appl. Publ. No. WO 94/02595) describes the general
methods for delivery of enzymatic RNA molecules. Ribozymes may be administered
to cells by a variety of methods known to those familiar with the art,
including, but not restricted to, encapsulation in liposomes, by iontophoresis,
or by incorporation into other vehicles, such as hydrogels, cyclodextrins,
biodegradable nanocapsules, and bioadhesive microspheres. For some indications,
ribozymes may be directly delivered ex vivo to cells or tissues with or without
the aforementioned vehicles. Alternatively, the RNA/vehicle combination may be
locally delivered by direct inhalation, by direct injection or by use of a
catheter, infusion pump or stent. Other routes of delivery include, but are not
limited to, intravascular, intramuscular, subcutaneous or joint injection,
aerosol inhalation, oral (tablet or pill form), topical, systemic, ocular,
intraperitoneal and/or intrathecal delivery. More detailed descriptions of
ribozyme delivery and administration are provided in Sullivan et al. (Int. Pat.
Appl. Publ. No. WO 94/02595) and Draper et al. (Int. Pat. Appl. Publ. No. WO
93/23569) which have been incorporated by reference herein.

Another means of accumulating high concentrations of a ribozyme(s) within
cells is to incorporate the ribozyme-encoding sequences into a DNA expression
vector. Transcription of the ribozyme sequences is driven from a promoter for
eukaryotic RNA polymerase I (pol I), RNA polymerase II (pol II), or RNA
polymerase III (pol III). Transcripts from pol II or pol III promoters will be
expressed at high levels in all cells; the levels of a given pol II promoter in
a given cell type will depend on the nature of the gene regulatory sequences
(enhancers, silencers, etc.) present nearby. Prokaryotic RNA polymerase
promoters may also be used, providing that the prokaryotic RNA polymerase
enzyme is expressed in the appropriate cells (Elroy-Stein and Moss, 1990; Gao
and Huang, 1993; Lieber et al., 1993; Zhou et al., 1990). Ribozymes expressed
from such promoters can function in mammalian cells (e.g., Kashani-Sabet et
al., 1992; Ojwang et al., 1992; Chen et al., 1992; Yu et al., 1993; L'Huillier
et al., 1992; Lisziewicz et al., 1993). Such transcription units can be
incorporated into a variety of vectors for introduction into mammalian cells,
including but not restricted to, plasmid DNA vectors, viral DNA vectors (such
as adenovirus or adeno-associated vectors), or viral RNA vectors (such as
retroviral, semliki forest virus, sindbis virus vectors).

Ribozymes of this invention may be used as diagnostic tools to examine
genetic drift and mutations within cell lines or cell types. They can also be
used to assess levels of the target RNA molecule. The close relationship
between ribozyme activity and the structure of the target RNA allows the
detection of mutations in any region of the molecule which alter the
base-pairing and three-dimensional structure of the target RNA. By using
multiple ribozymes described in this invention, one may map nucleotide changes
which are important to RNA structure and function in vitro, as well as in cells
and tissues. Cleavage of target RNAs with ribozymes may be used to inhibit gene
expression and define the role (essentially) of specified gene products in
particular cells or cell types.

5.0 Examples 

The following examples are included to demonstrate preferred embodiments of
the invention. It should be appreciated by those of skill in the art that the
techniques disclosed in the examples which follow represent techniques
discovered by the inventor to function well in the practice of the invention,
and thus can be considered to constitute preferred modes for its practice.
However, those of skill in the art should, in light of the present disclosure,
appreciate that many changes can be made in the specific embodiments which are
disclosed and still obtain a like or similar result without departing from the
spirit and scope of the invention.

5.1 Example 1 - Three-Dimensional Structure of Cry3Bb

The three-dimensional structure of Cry3Bb was determined by X-ray
crystallography. Crystallization of Cry3Bb and X-ray diffraction data
collection were performed as described by Cody et al. (1992). The crystal
structure of Cry3Bb was refined to a residual R factor of 18.0% using data
collected to 2.4 Å resolution. The crystals belong to the space group C222,
with unit cell dimensions a = 122.44, b = 131.81, and c = 105.37 Å and contain
one molecule in the asymmetric unit. Atomic coordinates for Cry3Bb are
described in Example 31 and listed in Section 9.

The structure of Cry3Bb is similar to that of Cry3A (Li et al., 1991). It
consists of 5825 protein atoms from 588 residues (amino acids 64 - 652) forming
three discrete domains (FIG. 1). A total of 251 water molecules have been
identified in the Cry3Bb structure (FIG. 2). Domain 1 (residues 64 - 294) is a
seven-helix bundle formed by six helices twisted around the central helix, α5
(FIG. 3). The amino acids forming each helix are listed in FIG. 4. Domain 2
(residues 295 - 502) contains three antiparallel β-sheets (FIG. 5A and FIG.
5B). Sheets 1 and 2, each composed of 4 β strands, form the distinctive "Greek
key" motif. The outer surface of sheet 3, composed of 3 β strands, makes
contact with helix α7 of domain 1. FIG. 6 lists the amino acids comprising each
β strand in domain 2. A small α helix, α8, which follows β strand 1, is also
included in domain 2. Domain 3 (residues 503 - 652) has a "jelly roll" β-barrel
topology which has a hydrophobic core and is nearly parallel to the a and
perpendicular to the c axes of the lattice (FIG. 7A and FIG. 7B). The amino
acids comprising each β strand of domain 3 are listed in FIG. 8.

The monomers of Cry3Bb in the crystal form a dimeric quaternary structure
along a two-fold axis parallel to the a axis (FIG. 9A and FIG. 9B). Helix α6
lies in a cleft formed by the interface of domain 1 and domains 1 and 3 of its
symmetry related molecule. There are numerous close hydrogen bonding contacts
along this surface, confirming the structural stability of the dimer.



5.2 Example 2 - Preparation of Cry3Bb.60 

B. thuringiensis EG7231 was grown through sporulation in C2 medium with
chloramphenicol (Cml) selection. The solids from this culture were recovered by
centrifugation and washed with water. The toxin was purified by
recrystallization from 4.0 M NaBr (Cody et al., 1992). The purified Cry3Bb was
solubilized in 10 ml of 50 mM KOH/100 mg Cry3Bb and buffered to pH 9.0 with 100
mM CAPS (pH 9.0). The soluble toxin was treated with trypsin at a weight ratio
of 50 mg toxin to 1 mg trypsin. After 20 min of trypsin digestion the
predominant protein visualized by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) was 60 kDa. Further digestion of the 60-kDa toxin was not observed.
FIG. 4 illustrates the Coomassie-stained Cry3Bb and Cry3Bb.60 following
SDS-PAGE.




5.3 Example 3 - Purification and Sequencing of Cry3Bb.60

Cry3Bb.60 was electrophoretically purified by SDS-PAGE and electroblotted to
Immobilon-P® (Millipore) membrane by semi-dry transfer at 15 V for 30 min. The
membrane was then washed twice with water and stained with 0.025% R-250, 40%
methanol. To reduce the background, the blot was destained with 50% methanol
until the stained protein bands were visible. The blot was then air dried, and
the stained Cry3Bb.60 band was cut out of the membrane. This band was sent to
the Tufts University Sequencing Laboratory (Boston, MA) for N-terminal
sequencing. The experimentally-determined N-terminal amino acid sequence is
shown in Table 7 beside the known amino acid sequence starting at amino acid
residue 160.

Table 7

Amino Acid Sequence of the N-Terminus of Cry3Bb.60 and
Comparison to the Known Sequence of Cry3Bb

Deduced Sequence    Known Sequence    Residue #
S                   S                 160
K                   K                 161
R                   R                 162
S                   S                 163
Q                   Q                 164
D                   D                 165
R                   R                 166



5.4 Example 4 - Bioactivity of Cry3Bb.60 

Cry3Bb was prepared for bioassay by solubilization in a minimal amount of
50 mM KOH, 10 ml per 100 mg toxin, and buffered to pH 9.0 with 100 mM CAPS, pH
9.0. Cry3Bb.60 was prepared as described in Example 2. Both preparations were
kept at room temperature 12 to 16 hours prior to bioassay. After seven days the
mortality of the population was determined and analyzed to determine the lethal
concentration of each toxin. These results are summarized in Table 8.



Table 8

Bioactivity of Cry3Bb and Cry3Bb.60 Against the Southern Corn
Rootworm (Diabrotica undecimpunctata)

               LC50 (mg/well)    95% C.I.
Cry3Bb         24.09             15 - 39
Cry3Bb.60      6.72              5.25 - 8.4
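
The relative potency implied by Table 8 can be worked out directly from the two
LC50 values. The following Python sketch is illustrative only; the function and
constant names are hypothetical, and it compares point estimates without
taking the confidence intervals into account.

```python
# LC50 values as reported in Table 8 for the full-length and truncated toxins.
LC50_CRY3BB = 24.09
LC50_CRY3BB_60 = 6.72


def fold_increase(lc50_reference: float, lc50_variant: float) -> float:
    """Return how many times more potent the variant is than the reference.

    A lower LC50 means higher potency, so the ratio is reference / variant.
    """
    if lc50_reference <= 0 or lc50_variant <= 0:
        raise ValueError("LC50 values must be positive")
    return lc50_reference / lc50_variant


if __name__ == "__main__":
    ratio = fold_increase(LC50_CRY3BB, LC50_CRY3BB_60)
    print(f"Cry3Bb.60 is about {ratio:.1f}-fold more active than Cry3Bb")
```

The ratio is roughly 3.6, which is in line with an increase of approximately
four-fold for the truncated toxin.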



5.5 Example 5 - Ion-Channel Formation by Cry3Bb and Cry3Bb.60

Cry3Bb.60 and Cry3Bb were evaluated for their ability to form ion channels
in planar lipid bilayers. Bilayers of phosphatidylcholine were formed on
Teflon® supports over a 0.7-mm hole. A bathing solution of 3.5 ml 100 mM KOH,
10 mM CaCl2, 100 mM CAPS (pH 9.5) was placed on either side of the Teflon®
partition. The toxin was added to one side of the partition and a voltage of 60
mV was imposed across the phosphatidylcholine bilayer. Any leakage of ions
through the membrane was amplified and recorded. An analysis of the frequency
of the conductances created by either Cry3Bb or Cry3Bb.60 is illustrated in
FIG. 5A and FIG. 5B. Cry3Bb.60 readily formed ion channels whereas Cry3Bb
rarely formed channels.

5.6 Example 6 - Formation of High Molecular-Weight Oligomers

Individual molecules of Cry3Bb or Cry3Bb.60 form a complex with another like
molecule. The ability of Cry3Bb to form an oligomer is not reproducibly
apparent. The complex cannot be repeatedly observed to form under nondenaturing
conditions. Cry3Bb.60 formed a significantly greater amount of a higher
molecular-weight complex (>120 kDa) with other Cry3Bb.60 molecules. Oligomers
of Cry3Bb are demonstrated by the intensity of the Coomassie-stained SDS
polyacrylamide gel. Oligomerization is visualized on SDS-PAGE by not heating
samples prior to loading on the gel to retain some nondenatured toxin. These
data suggest that Cry3Bb.60 more readily forms the higher order complex than
Cry3Bb alone. Oligomerization is also observed by studying the conductance
produced by these molecules and the time-dependent increase in conductance.
This change in conductance can be attributed to oligomerization of the toxin.

5.7 Example 7 - Design Method 1: Identification and Alteration of
Protease-Sensitive Sites and Proteolytic Processing

It has been reported in the literature that treatment of Cry3A toxin protein
with trypsin, an enzyme that cleaves proteins on the carboxyl side of available
lysine and arginine residues, yields a stable cleavage product of 55 kDa from
the 67 kDa native protein (Carroll et al., 1989). N-terminal sequencing of the
55 kDa product showed cleavage occurs at amino acid residue R158. The truncated
Cry3A protein was found to retain the same level of insecticidal activity as
the native protein. Cry3Bb toxin protein was also treated with trypsin. After
digestion, the protein size decreased from 68 kDa, the molecular weight of the
native Cry3Bb toxin, to 60 kDa. No further digestion was observed. N-terminal
sequencing revealed the trypsin cleavage site of the truncated toxin
(Cry3Bb.60) to be amino acid R159 in lα3,4 of Cry3Bb. Unexpectedly, the
bioactivity of the truncated Cry3Bb toxin was found to increase.
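
The cleavage rule stated above (carboxyl side of lysine and arginine) can be
sketched in a few lines of Python. The example applies it to residues 160-166
of Cry3Bb as listed in Table 7; the function name is hypothetical, and the rule
is a deliberate simplification that ignores accessibility and neighboring
residue effects, which is why only some candidate sites are cut in practice.

```python
def tryptic_sites(sequence: str, first_residue_number: int = 1) -> list[tuple[str, int]]:
    """List (residue, position) pairs after which trypsin could cleave.

    Simplified rule: cleavage on the carboxyl side of every K or R.
    Real digests also depend on site accessibility and sequence context.
    """
    sites = []
    for offset, residue in enumerate(sequence.upper()):
        if residue in "KR":
            sites.append((residue, first_residue_number + offset))
    return sites


if __name__ == "__main__":
    # Residues 160-166 of Cry3Bb, taken from Table 7.
    for residue, position in tryptic_sites("SKRSQDR", first_residue_number=160):
        print(f"candidate cleavage after {residue}{position}")
```

That candidate sites remain in the stable 60-kDa product is consistent with
the observation that no further digestion occurred, presumably because those
sites are not accessible in the folded protein.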

Using this method, protease digestion of a B. thuringiensis toxin protein, a
proteolytically sensitive site was identified on Cry3Bb, and a more highly
active form of the protein (Cry3Bb.60) was identified. Modifications to this
proteolytically-sensitive site by introducing an additional protease
recognition site also resulted in the isolation of a biologically more active
protein. It is also possible that removal of other protease-sensitive site(s)
may improve activity. Proteolytically sensitive regions, once identified, may
be modified or utilized to produce biologically more active toxins.

5.7.1 Cry3Bb.60

Treatment of solubilized Cry3Bb toxin protein with trypsin results in the
isolation of a stable, truncated Cry3Bb toxin protein with a molecular weight
of 60 kDa (Cry3Bb.60). N-terminal sequencing of Cry3Bb.60 shows the
trypsin-sensitive site to be R159 in lα3,4 of the native toxin. Trypsin
digestion results in the removal of helices 1-3 from the native Cry3Bb but also
increases the activity of the toxin against SCRW larvae approximately
four-fold.

Cry3Bb.60 is a unique toxin with enhanced insecticidal use over the parent
Cry3Bb. Improved biological activity is only one parameter that distinguishes
it as a new toxin. Aside from the reduced size, Cry3Bb.60 is also a more
soluble protein. Cry3Bb precipitates from solution at pH 6.5 while Cry3Bb.60
remains in solution from pH 4.5 to pH 12. Cry3Bb.60 also forms ion channels
with greater frequency than Cry3Bb.

Cry3Bb.60 is produced by either the proteolytic removal of the first 159
amino acid residues, or the in vivo production of this toxin, by bacteria or
plants expressing the gene for Cry3Bb.60, that is, the Cry3Bb gene without the
first 483 nucleotides.

In conclusion, Cry3Bb.60 is distinct from Cry3Bb in several important ways:
enhanced insecticidal activity; enhanced range of solubility; enhanced ability
to form channels; and reduced size.

5.7.2 EG11221 

Semi-random mutagenesis of the trypsin-sensitive lα3,4 region of Cry3Bb
resulted in the isolation of Cry3Bb.11221, a designed Cry3Bb protein that
exhibits over a 6-fold increase in activity against SCRW larvae compared to WT.
Cry3Bb.11221 has 4 amino acid changes in the lα3,4 region. One of these
changes, L158R, introduces an additional trypsin site adjacent to R159, the
proteolytically sensitive site used to produce Cry3Bb.60 (example 4.1.1).
Cry3Bb.11221 is produced by B. thuringiensis as a full length toxin protein but
is presumably digested by insect gut proteases to the same size as Cry3Bb.60
(see Cry3A results from Carroll et al., 1989). The additional protease
recognition site may make the lα3,4 region even more sensitive to digestion,
thereby increasing activity.

5.8 Example 8 - Design Method 2: Determination and Manipulation
of Bound Water

There are several ways that water molecules can associate with a protein,
including surface water that is easily removed and bound water that is more
difficult to extract (Dunitz, 1994; Zhang and Matthews, 1994). The function of
bound water has been the subject of significant academic extrapolation, but the
precise function has little experimental validation. Some of the most
interesting bound or structural water is the water that participates in the
protein structure from inside the protein itself.

The occupation of a site by a water molecule can indicate a stable pocket
within a protein or a looseness of packing created by water-mediated salt
bridges and hydrogen bonding to water. This can reduce the degree of bonding
between amino acids, possibly making the region more flexible. A different
amino acid sequence around that same site could result in better packing,
collapsing the pocket around polar or charged amino acids. This may result in
decreased flexibility. Therefore, the degree of hydration of a region of a
protein may determine the flexibility or mobility of that region, and
manipulation of the hydration may alter the flexibility. Methods of increasing
the hydration of a water-exposed region include increasing the number of
hydrophobic residues along that surface. It is taught in the art that exposed
hydrophobic residues require significantly more water to hydrate than
hydrophilic residues (CRC Handbook of Chemistry and Physics, CRC Press, Inc.).
It is not taught, however, that by doing this, improvements to the biological
activity of a protein can be achieved.

Structural water has not previously been identified in B. thuringiensis
δ-endotoxins including Cry3Bb. Furthermore, there are no reports of the
function of this structural water in δ-endotoxins or bacterial toxins. In the
analysis of Cry3Bb, it was observed that a collection of water molecules is
located around lα3,4, a site defined by the inventors as important for
improvement of bioactivity. The loop α3,4 region is surface exposed and may
define a hinge in the protein permitting either removal or movement of the
first three helices of domain 1. The hydration found around this region may
impart flexibility and mobility to this loop. The observation of structural
water at the lα3,4 site provided an analytical tool for further structure
analysis. If this important site is surrounded by water, then other important
sites may also be completely or partially surrounded by water. Using this
insight, structural water surrounding helices 5 and 6 was then identified. This
structural water forms a column through the protein, effectively separating
helices 5 and 6 from the rest of the molecule. The structures of Cry3A and
Cry3Bb suggest that helices 5 and 6 are tightly associated, bound together by
Van der Waals interactions. Alone, helix 5 from Cry3A, although insufficient
for biological activity, has been demonstrated to have the ability to form ion
channels in an artificial membrane (Gazit and Shai, 1993). The ion channels
formed by helix 5 are 10-fold smaller than the channels of the full length
toxin, suggesting that significantly more toxin structure is required for the
full-sized ion channels. In Cry3Bb, helix 5 as part of a cluster of α helices
(domain 1) has been found to form ion channels (Von Tersch et al., 1994).
Unpublished experimental observations by the inventors demonstrate that helix 6
also crossed the biological membrane. Helices 5 and 6, therefore, are the
putative channel-forming helices necessary for toxicity.

The hydration around these helices may indicate that flexibility of this
region is necessary for toxicity. It is conceivable, therefore, that if it were
possible to improve the hydration around helices 5 and 6, one could create a
better toxin protein. Care must be taken, however, to avoid creating continuous
hydrophobic surfaces between helices 5-6 and any other part of the protein
which could, by hydrophobic interactions, act to restrict movement of the
mobile helices. The mobility of helices 5 and 6 may also depend on the
flexibility of the loops attached to them as well as on other regions of the
Cry3Bb molecule, particularly in domain 1, which may undergo conformational
changes to allow insertion of the 2 helices into the membrane. Altering the
hydration of these regions of the protein may also affect its bioactivity.

5.8.1 Cry3Bb.11032

A collection of bound water residues indicated the relative flexibility of
the lα3,4 region. The flexibility of this loop can be increased by increasing
the hydration of the region by substituting relatively hydrophobic residues for
the exposed hydrophilic residues. An example of an improved, designed protein
having this type of substitution is Cry3Bb.11032. Cry3Bb.11032 has the amino
acid change D165G; glycine is more hydrophobic than aspartate (Kyte and
Doolittle hydrophobicity score of -0.4 vs. -3.5 for aspartate). Cry3Bb.11032 is
approximately 3 times more active than WT Cry3Bb.

5.8.2 Cry3Bb.11051

To increase the hydration of the lα4,5 region of Cry3Bb, glycine was
substituted for the surface exposed residue K189. Glycine is more hydrophobic
than lysine (Kyte and Doolittle hydrophobicity score of -0.4 vs. -3.9 for
lysine) and may result in an increase in bound water. The increase in bound
water may impart greater flexibility to the loop region which precedes the
channel-forming helix, α5. The designed Cry3Bb protein with the K189G change,
Cry3Bb.11051, exhibits a 3-fold increase in activity compared to WT Cry3Bb.
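
The size of the hydrophobicity shift for the D165G and K189G substitutions can
be read off the Kyte and Doolittle scores quoted in the text. The short Python
sketch below is illustrative only; the dictionary holds just the three residues
discussed here, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy scores for the three residues discussed here.
KD_SCORE = {"D": -3.5, "G": -0.4, "K": -3.9}


def hydropathy_shift(substitution: str) -> float:
    """Return the score change for a substitution written like 'D165G'.

    The first character is the original residue, the last the replacement.
    A positive result means the replacement is more hydrophobic.
    """
    original, replacement = substitution[0], substitution[-1]
    return round(KD_SCORE[replacement] - KD_SCORE[original], 2)


if __name__ == "__main__":
    for change in ("D165G", "K189G"):
        print(f"{change}: {hydropathy_shift(change):+.1f}")
```

Both substitutions move the score in the hydrophobic direction, by 3.1 and 3.5
units respectively.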

5.8.3 Alterations to lα7,β1 (Cry3Bb.11241 and 11242)

Amino acid changes made in the surface-exposed loop connecting α-helix 7 and
β-strand 1 (lα7,β1) resulted in the identification of 2 altered Cry3Bb proteins
with increased bioactivities, Cry3Bb.11241 and Cry3Bb.11242. Analysis of the
hydropathy index of these 2 proteins over the 20 amino acid sequence 281-300,
inclusive of the lα7,β1 region, reveals that the amino acid substitutions in
these proteins have made the lα7,β1 region much more hydrophobic. The grand
average of hydropathy value (GRAVY) was determined for each protein sequence
using the PC\GENE® (IntelliGenetics, Inc., Mountain View, CA, release 6.85)
protein sequence analysis computer program, SOAP, and a 7 amino acid interval.
The SOAP program is based on the method of Kyte and Doolittle (1982). The
increase in hydrophobicity of the lα7,β1 region for each protein may increase
the hydration of the loop and, therefore, the flexibility. The altered
proteins, their respective amino acid changes, fold-increases over WT
bioactivity, and GRAVY values are listed in Table 9.
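
A hydropathy calculation of this general kind can be sketched in Python using
the Kyte and Doolittle (1982) scale. This is a minimal illustration and not a
reimplementation of SOAP: the function names are hypothetical, the peptide is
an invented 20-residue sequence rather than residues 281-300 of Cry3Bb, and
SOAP may scale its output differently, so the numbers are not expected to
match the table values.

```python
# Kyte-Doolittle (1982) hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean Kyte-Doolittle score over the whole sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KD[residue] for residue in sequence.upper()) / len(sequence)


def window_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean score for each full window sliding along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [gravy(sequence[i:i + window]) for i in range(len(sequence) - window + 1)]


if __name__ == "__main__":
    # Hypothetical 20-residue peptides; NOT the Cry3Bb 281-300 sequence.
    parent = "ASTGYDSRNQTESKAGTSDN"
    variant = parent[:7] + "L" + parent[8:]  # replace the arginine with leucine
    print(f"parent GRAVY:  {gravy(parent):.2f}")
    print(f"variant GRAVY: {gravy(variant):.2f}")
    print(len(window_profile(parent)))  # number of 7-residue windows
```

Replacing a charged residue such as arginine with leucine or valine raises the
average, which is the direction of change reported for the designed proteins.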

Table 9

Hydropathy Values for the lα7,β1 Region of Cry3Bb and 2 Designed
Cry3Bb Proteins Showing Increased SCRW Bioactivity

Cry3Bb*         Amino Acid Changes     Fold Increase in    GRAVY
Protein                                Bioactivity         (Amino Acids
                                       Over WT             281-300)
wildtype                                                   4.50
Cry3Bb.11241    Y287F, D288N, R290L    2.6x                10.70
Cry3Bb.11242    R290V                  2.5x                8.85



5.8.4 Alterations to lβ1,α8 (Cry3Bb.11228, Cry3Bb.11229,
Cry3Bb.11230, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237,
Cry3Bb.11238 and Cry3Bb.11239)

The surface-exposed loop between β-strand 1 and α-helix 8 (lβ1,α8) defines
the boundary between domains 1 and 2 of Cry3Bb. The introduction of semi-random
amino acid changes to this region resulted in the identification of several
altered Cry3Bb proteins with increased bioactivity. Hydropathy index analysis
of the amino acid substitutions found in the altered proteins shows that the
changes have made the exposed region more hydrophobic which may result in
increased hydration and flexibility. Table 10 lists the altered proteins, their
respective amino acid changes and fold increases over WT Cry3Bb and the grand
average of hydropathy value (GRAVY) determined using the PC\GENE®
(IntelliGenetics, Inc., Mountain View, CA, release 6.85) protein sequence
analysis program, SOAP, over the 20 amino acid sequence 305 - 324 inclusive of
lβ1,α8, using a 7 amino acid interval.



Table 10

Hydropathy Values for the lβ1,α8 Region of Cry3Bb and 8 Designed
Cry3Bb* Proteins Showing Increased SCRW Bioactivity

Cry3Bb*         Amino Acid Changes       Fold Increase in    GRAVY
Protein                                  Bioactivity Over    (Amino Acids
                                         Wild Type           305-324)
wildtype                                                     U.OJ
Cry3Bb.11228    S311L, N313T, E317K      4.1x
Cry3Bb.11229    S311T, E317K, Y318C      2.5x                z.ou
Cry3Bb.11230    S311A, L312V, Q316W      4.7x
Cry3Bb.11233    S311A, Q316D             2.2x                2.15
Cry3Bb.11236    S311I                    3.1x                3.50
Cry3Bb.11237    S311I, N313H             5.4x                3.65
Cry3Bb.11238    N313V, T314N,            2.6x                9.85
                Q316M, E317V
Cry3Bb.11239    N313R, L315P,            2.8x                3.95
                Q316L, E317A







5.8.5 Cry3Bb.11227, Cry3Bb.11241 and Cry3Bb.11242

Amino acid Q238, located in helix 6 of Cry3Bb, has been identified as a
residue that, by its large size and hydrogen bonding to R290, blocks complete
hydration of the space between helix 6 and helix 4. Substitution of R290 with
amino acids that do not form hydrogen bonds or that have side chains that
cannot span the physical distance to hydrogen bond with Q238 may result in
increased hydration around Q238. Q238, unable to hydrogen bond to R290, may now
bind water. This may increase the flexibility of the channel-forming region.
Designed proteins Cry3Bb.11227 (R290N), Cry3Bb.11241 (R290L) and Cry3Bb.11242
(R290V) show increased activities of approximately 2-fold, 2.6-fold and
2.5-fold, respectively, against SCRW larvae compared to WT.

5.9 Example 9 - Design Method 3: Manipulation of Hydrogen Bonds
Around Mobile Regions

Mobility of regions of a protein may be required for activity. The mobility
of the α5,6 region, the putative channel-forming region of Cry3Bb, may be
improved by decreasing the number of hydrogen bonds, including salt bridges
(hydrogen bonds between oppositely charged amino acid side chains), between
helices 5-6 and any other part of the molecule or dimer structure. These
hydrogen bonds may impede the movement of the two helices. Decreasing the
number of hydrogen bonds and salt bridges may improve biological activity.
Replacement of hydrogen-bonding amino acids with hydrophobic residues must be
done with caution to avoid creating continuous hydrophobic surfaces between
helices 5-6 and any other part of the dimer. This may decrease mobility by
increasing hydrophobic surface interactions.

5.9.1 Cry3Bb.11222 and Cry3Bb.11223

Tyr230 is located on helix 6 and, in the quaternary dimer structure of
Cry3Bb, this amino acid is coordinated with Tyr230 from the adjacent molecule.
Three hydrogen bonds are formed between the two helices 6 in the two monomers
because of this single amino acid. In order to improve the flexibility of
helices 5-6, the helices theoretically capable of penetrating the membrane and
forming an ion channel, the hydrogen bonds across the dimer were removed by
changing this amino acid, and a corresponding increase in biological activity
was observed. The designed Cry3Bb proteins, Cry3Bb.11222 and Cry3Bb.11223, show
a 4-fold and 2.8-fold increase in SCRW activity, respectively, compared to WT.

5.9.2 Cry3Bb.11051 

Designed Cry3Bb protein Cry3Bb.11051 has amino acid change K189G in lα4,5 of
domain 1. In the WT Cry3Bb structure, the exposed side chain of K189 is close
enough to the exposed side chain of E123, located in lα2b,3, to form hydrogen
bonds. Substitution of K189 with glycine, as found in this position in Cry3A,
removes the possibility of hydrogen bond formation at this site and results in
a protein with a bioactivity three-fold greater than WT Cry3Bb.

5.9.3 Cry3Bb.11227, Cry3Bb.11241 and Cry3Bb.11242

Amino acid Q238, located in helix 6 of Cry3Bb, has been identified as a
residue that, by its large size and hydrogen bonding to R290, blocks complete
hydration of the space between helix 6 and helix 4. Substitution of R290 with
amino acids that do not form hydrogen bonds or that have side chains that
cannot span the physical distance to hydrogen bond with Q238 may increase the
flexibility of the channel-forming region. Designed proteins Cry3Bb.11227
(R290N), Cry3Bb.11241 (R290L) and Cry3Bb.11242 (R290V) show increased
activities of approximately 2-fold, 2.6-fold and 2.5-fold, respectively,
against SCRW larvae compared to WT.

5.10 Example 10 - Design Method 4: Loop Analysis and Loop Design
Around Flexible Helices

Loop regions of a protein structure may be involved in numerous functions
of the protein including, but not limited to, channel formation, quaternary structure
formation and maintenance, and receptor binding. Cry3Bb is a channel-forming
protein. The availability of the ion channel-forming helices of δ-endotoxins to
move into the bilayer depends upon the absence of forces that hinder the process.
One of the forces possibly limiting this process is the steric hindrance of amino
acid side chains in loop regions around the critical helices. The literature suggests
that in at least one other bacterial toxin, not a B. thuringiensis toxin, the toxin
molecule opens up or, in scientific terms, loses some of its quaternary structure to
expose a membrane-active region (Cramer et al., 1990). This literature does not
teach how to improve the probability of this event occurring, and it is not known
whether B. thuringiensis toxins use this same process to penetrate the membrane.
Reducing the steric hindrance of the amino acid side chains in these critical
regions, by reducing their size or altering side chain positioning, with a
corresponding increase in biological activity, was the inventive step.

5.10.1 Analysis of the Loop Between Helices 3 and 4 (Cry3Bb.11032)

The inventors have discovered that the first three helices of domain 1
could be cleaved from the rest of the toxin by proteolytic digestion of the loop
between helices α3 and α4 (Cry3Bb.60). Initial efforts to truncate the cry3Bb gene
to produce this shortened, though more active, Cry3Bb molecule failed. For
unknown reasons, B. thuringiensis failed to synthesize this 60-kDa molecule. It was
then reasoned that perhaps the first three helices of domain 1 did not have to be
proteolytically removed or, equivalently, the protein did not have to be synthesized
in this truncated form to take advantage of the Cry3Bb.60 design. It was observed
that the protein Cry3A had a small amino acid near lα3,4 that might impart
greater flexibility in the loop region, thereby permitting the first three helices of
domain 1 to move out of the way, exposing the membrane-active region. By
designing a Cry3Bb molecule with a glycine residue near this loop, the steric
hindrance of residues in the loop might be lessened. The redesigned protein,
Cry3Bb.11032, has the amino acid change D165G, which replaces the larger
aspartate residue (average mass of 115.09) with the smallest amino acid, glycine
(average mass of 57.05). The activity of Cry3Bb.11032 is approximately 3-fold
greater than that of the WT protein. In this way, the loop between helices α3 and
α4 was rationally redesigned with a corresponding increase in biological activity.
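
The substitution shorthand used throughout these examples (for instance D165G:
wild-type residue, position, replacement residue) and the size argument made
above can be illustrated with a short sketch. This is only an illustration and
not part of the described method. The function names are hypothetical. The
residue masses for aspartate and glycine are the ones quoted in the text; the
remaining values are standard average residue masses added here as an assumption
for illustration.

```python
import re

# Average residue masses in daltons. D and G are the values quoted in the
# text; the other entries are standard values added for illustration only.
AVERAGE_RESIDUE_MASS = {
    "G": 57.05,
    "D": 115.09,
    "K": 128.17,
    "R": 156.19,
    "N": 114.10,
    "L": 113.16,
    "V": 99.13,
}


def parse_substitution(code):
    """Split a code such as 'D165G' into (wild-type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a point substitution: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def mass_change(code):
    """Return replacement mass minus wild-type mass, rounded to 0.01 Da."""
    wild_type, _position, replacement = parse_substitution(code)
    delta = AVERAGE_RESIDUE_MASS[replacement] - AVERAGE_RESIDUE_MASS[wild_type]
    return round(delta, 2)


if __name__ == "__main__":
    # Substitutions named in Examples 9 and 10.
    for code in ("D165G", "K189G", "R290N", "R290L", "R290V"):
        print(code, parse_substitution(code), mass_change(code))
```

A negative value means the replacement residue is smaller than the wild-type
residue, which is the direction of change sought in this design method. Mass is
only a rough proxy for side chain bulk.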

5.10.2 Cry3Bb.11051

The loop region connecting helices α4 and α5 in Cry3Bb must be flexible
so that the channel-forming helices α5-α6 can penetrate into the membrane. It was
noticed that Cry3A has a glycine residue in the middle of this loop that may impart
greater flexibility. The corresponding change, K189G, was made in Cry3Bb and
the resulting designed protein, Cry3Bb.11051, exhibits a 3-fold increase in activity
against SCRW larvae compared to WT Cry3Bb.

5.10.3 Analysis of the Loop Between β-Strand 1 and Helix 8
(Cry3Bb.11228, Cry3Bb.11229, Cry3Bb.11230, Cry3Bb.11232,
Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237, Cry3Bb.11238, and
Cry3Bb.11239)

The loop region located between β strand 1 of domain 2 and α helix 8 in
domain 2 is very close to the loop between α helices 6 and 7 in domain 1. Some of
the amino acid side chains of lβ1,α8 appear as though they may sterically impede
movement of lα6,7. Since lα6,7 must be flexible for the channel-forming helices
α5-α6 to insert into the membrane, it was thought that re-engineering this loop
might change the positioning of the side chains, resulting in less steric hindrance.
This was accomplished, creating proteins with increased biological activities
ranging from 2.2 to 5.4 times greater than WT. These designed toxin proteins and
their amino acid changes are listed in Table 2 as Cry3Bb.11228, Cry3Bb.11229,
Cry3Bb.11230, Cry3Bb.11232, Cry3Bb.11233, Cry3Bb.11236, Cry3Bb.11237,
Cry3Bb.11238, and Cry3Bb.11239.

5.10.4 Analysis of the Loop Between Helix 7 and β-Strand 1
(Cry3Bb.11227, Cry3Bb.11234, Cry3Bb.11241, Cry3Bb.11242, and
Cry3Bb.11036)

If Cry3Bb is similar to a bacterial toxin which must open up to expose a
membrane-active region for toxicity, it is possible that other helices in addition to
the channel-forming helices must also change positions. It was reasoned that, if
helices α5-α6 insert into the membrane, then helix α7 may have to change position
also. Since it was shown in Example 4.4.3 that increasing flexibility between helix
α6 and α7 can increase activity, greater flexibility in the loop following helix α7,
lα7,β1, may also increase bioactivity. Alterations to the lα7,β1 region of Cry3Bb
resulted in the isolation of several proteins with increased activities ranging from
1.9 to 4.3 times greater than WT. These designed proteins are listed in Table 7 as
Cry3Bb.11227, Cry3Bb.11234, Cry3Bb.11241, Cry3Bb.11242, and
Cry3Bb.11036.

5.11 Example 11 - Design Method 5: Loop Design Around β Strands
and β Sheets

Loop regions of a protein structure may be involved in numerous functions
of the protein including, but not limited to, channel formation, quaternary structure
formation and maintenance, and receptor binding. A binding surface is often
defined by a number of loops, as is the case with immunoglobulin G (IgG) (see
Branden and Tooze, 1991, for review). What cannot be determined at this point,
however, is which loops will be important for receptor interactions just by looking at
the structure of the protein in question. Since a receptor has not been identified for
Cry3Bb, it is not even possible to compare the structure of Cry3Bb with other
proteins that have the same receptor for structural similarities. To identify Cry3Bb
loops that contribute to receptor interactions, random mutagenesis was performed
on surface-exposed loops.

As each loop was altered, the profile of the overall bioactivities of the
resultant proteins was examined and compared. The loops, especially in domain 2,
which appears to be unnecessary for channel activity, fall into two categories: (1)
loops that could be altered without much change in the level of bioactivity of the
resultant proteins and (2) loops where alterations resulted in overall loss of
resultant protein bioactivity. Using this design method, it is possible to identify
several loops important for activity.

5.11.1 Analysis of Loop β2,3

Semi-random mutagenesis of the loop region between β strands 2 and 3
resulted in the production of structurally stable toxin proteins with significantly
reduced activities against SCRW larvae. The lβ2,3 region is highly sensitive to
amino acid changes, indicating that specific amino acids or amino acid sequences
are necessary for toxin protein activity. It is conceivable, therefore, that specific
changes in the lβ2,3 region will increase the binding and, therefore, the activity of
the redesigned toxin protein.

5.11.2 Analysis of Loop β6,7

Semi-random mutations introduced to the loop region between β strands 6
and 7 resulted in structurally stable proteins with an overall loss of SCRW
bioactivity. The lβ6,7 region is highly sensitive to amino acid changes, indicating that
specific amino acids or amino acid sequences are necessary for toxin protein
activity. It is conceivable, therefore, that specific changes in the lβ6,7 region will
increase the binding and, therefore, the activity of the redesigned toxin protein.

5.11.3 Analysis of Loop β10,11

Random mutations to the loop region between β strands 10 and 11 resulted
in proteins having an overall loss of SCRW bioactivity. Loop β10,11 is
structurally close to and interacts with loops β2,3 and β6,7. Specific changes to individual
residues within the lβ10,11 region may also result in increased interaction with the
insect membrane, increasing the bioactivity of the toxin protein.

5.11.4 Cry3Bb.11095

Loops β2,3, β6,7 and β10,11 have been identified as important for
bioactivity of Cry3Bb. The 3 loops are surface-exposed and structurally close together.
Amino acid Q348 in the WT structure, located in β-strand 2 just prior to lβ2,3,
does not form any intramolecular contacts. However, replacing Q348 with
arginine (Q348R) results in the formation of 2 new hydrogen bonds between R348
and the backbone carbonyls of R487 and R488, both located in lβ10,11. The new
hydrogen bonds may act to stabilize the structure formed by the 3 loops. The
designed protein carrying this change, Cry3Bb.11095, is 4.6-fold more active than
WT Cry3Bb.



5.12 Example 12 - Design Method 6: Identification and Re-design of
Complex Electrostatic Surfaces

Interactions of proteins include hydrophobic interactions (e.g., Van der
Waals forces), hydrophilic interactions, including those between opposing charges
on amino acid side chains (salt bridges), and hydrogen bonding. Very little is
known about δ-endotoxin and receptor interactions. Currently, there are no
literature reports identifying the types of interactions that predominate between
B. thuringiensis toxins and receptors.

Experimentally, however, it is important to increase the strength of the
B. thuringiensis toxin-receptor interaction and not permit the precise determination
of the chemical interaction to stand in the way of improving it. To accomplish this,
the electrostatic surface of Cry3Bb was defined by solving the Poisson-Boltzmann
distribution around the molecule. Once this electrically defined surface was
solved, it could then be inspected for regions of greatest diversity. It was reasoned
that these electrostatically diverse regions would have the greatest probability of
participating in the specific interactions between the B. thuringiensis toxin proteins
and the receptor, rather than more general and non-specific interactions. Therefore,
these regions were chosen for redesign, continuing to increase the electrostatic
diversity of the regions. In addition, examination of the electrostatic interaction
around the putative channel-forming region of the toxin created insights for
redesign. This includes identification of an electropositive residue in an otherwise
negatively charged conduit (see Example 4.6.1).
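
For orientation, the general form of the Poisson-Boltzmann equation referred to
above is reproduced here. The text does not state which form, solver, or
parameters were used, so the following is the standard textbook equation and
not a record of the inventors' calculation. The symbols are assumptions of this
note: φ is the electrostatic potential, ε(r) the position-dependent dielectric
constant, ρ_f the fixed charge density of the protein, c_i^∞ and z_i the bulk
concentration and valence of mobile ion species i, q the elementary charge,
λ(r) the ion accessibility (0 inside the protein, 1 in solvent), k_B the
Boltzmann constant, and T the temperature.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  = -\rho_{f}(\mathbf{r})
    - \sum_{i} c_{i}^{\infty} z_{i} q \, \lambda(\mathbf{r})
      \exp\!\left( -\frac{z_{i} q \, \phi(\mathbf{r})}{k_{B} T} \right)
```

Solving for φ and evaluating it at the molecular surface gives the kind of
electrostatic surface map that can be inspected for charged patches, such as a
positive residue in an otherwise negative conduit.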

5.12.1 R290 (Cry3Bb.11227, Cry3Bb.11241, and Cry3Bb.11242)

Examination of the Cry3Bb dimer interface along the domain 1 axis
suggested that a pore or conduit for cations might be formed between the monomers.
Electrostatic examination of this axis lent additional credibility to this suggestion.
In fact, the hypothetical conduit is primarily negatively charged, an observation
consistent with the biophysical analysis of cation-selective, δ-endotoxin channels.
If a cation channel were formed along the axis of the dimer, then the cation could
move between the monomers relatively easily with only one significant hurdle. A
positively charged arginine residue (R290) lies in the otherwise negatively charged
conduit. This residue could impede the cation movement through the channel.
Based on this analysis, R290 was changed to uncharged residues. The bioactivity
of redesigned proteins Cry3Bb.11227 (R290N), Cry3Bb.11241 (R290L) and
Cry3Bb.11242 (R290V) was improved approximately 2-fold, 2.6-fold and 2.5-fold,
respectively.

5.12.2 Cry3Bb.60

Trypsin digestion of solubilized Cry3Bb yields a stable, truncated protein
with a molecular weight of 60 kDa (Cry3Bb.60). Trypsin digestion occurs on the
carboxyl side of residue R159, effectively removing helices 1 through 3 from the
native Cry3Bb structure. The cleavage of the first 3 helices exposes an
electrostatic surface different from those found in the native structure. The new surface
has a combination of hydrophobic, polar and charged characteristics that may play
a role in membrane interactions. The bioactivity of Cry3Bb.60 is 3.6-fold greater
than that of WT Cry3Bb.

5.13 Example 13 - Design Method 7: Identification and Removal of
Metal Binding Sites

The literature teaches that the in vitro activity of B. thuringiensis toxins
can be increased by chelating divalent cations from the experimental system
(Crawford and Harvey 1988). It was not known, however, how these divalent
cations inhibited the in vitro activity. Crawford and Harvey (1988) demonstrated
that the short circuit current across the midgut was more severely inhibited by
B. thuringiensis in the presence of EDTA, a chelator of divalent ions, than in the
absence of this agent, thus suggesting that this step in the mode of action of
B. thuringiensis could be potentiated by removing divalent ions. Similar
observations were made using black-lipid membranes and measuring an increase in the
current created by the δ-endotoxins in the presence of EDTA to chelate divalent
ions. There were at least three possible explanations for these observations. The
first explanation could be that the divalent ions are too large to move through an ion
channel more suitable for monovalent ions, thereby blocking the channel. Second,
the divalent ions may cover the protein in a very general way, thereby buffering
the charge interactions required for toxin membrane interaction and limiting ion
channel activity. The third possibility is that a specific metal binding site exists on
the protein and, when occupied by divalent ions, the performance of the ion
channel is impaired. Although the literature could not differentiate the value of one
possibility over another, the third possibility led to an analysis of the Cry3Bb
structure searching for a specific metal binding site that might alter the probability
that a toxin could form an ion channel.

5.13.1 H231 (Cry3Bb.11222, Cry3Bb.11224, Cry3Bb.11225, and
Cry3Bb.11226)

A putative metal binding site is formed in the Cry3Bb dimer structure by
the H231 residues of each monomer. The H231 residues, located in helix α6, lie
adjacent to each other and close to the axis of symmetry of the dimer. Removal of
this site by replacement of histidine with other amino acids was evaluated by the
absence of EDTA-dependent ion channel activity. The bioactivities of the
designed toxin proteins, Cry3Bb.11222, Cry3Bb.11224, Cry3Bb.11225 and
Cry3Bb.11226, are increased 4-, 5-, 3.6- and 3-fold, respectively, over that of WT
Cry3Bb. Their respective amino acid changes are listed in Table 2.

5.14 Example 14 - Design Method 8: Alteration of Quaternary
Structure

Cry3Bb can exist in solution as a dimer similar to a related protein, Cry3A
(Walters et al., 1992). However, the importance of the dimer to biological activity
is not known because the toxin as a monomer or as a higher order structure has not
been seriously evaluated. It is assumed that specific amino acid residues contribute
to the formation and stability of the quaternary structure. Once a contributing
residue is identified, alterations can be made to diminish or enhance the effect of that
residue, thereby affecting the interaction between monomers. Channel activity is a
useful way, but by no means the only way, to assess quaternary structure of
Cry3Bb and its derivatives. It has been observed that Cry3Bb creates gated
conductances in membranes that grow in size with time, ultimately resulting in large
pores in the membrane (the channel activity of WT Cry3Bb is described in Section
12.1). It also has been observed that Cry3A forms a more stable dimer than
Cry3Bb and coincidentally forms higher level conductances faster (FIG. 10). This
observation led the inventors to propose that oligomerization and ion channel
formation (conductance size and speed of channel formation) were related. Based on
this observation, Cry3Bb was re-engineered to make larger and more stable
oligomers at a faster rate. It is assumed in this analysis that the rate of ion channel
formation and growth mirrors this process. It is also possible that changes in
quaternary structure may not affect channel activity alone or at all. Alterations to
quaternary structure may also affect receptor interactions, protein processing in the
insect gut environment, and other, unknown aspects of bioactivity.

5.14.1 Cry3Bb.11048

Comparative structural analysis of Cry3A and Cry3Bb led to the
identification of structural differences between the two toxins in the ion channel-forming
domain; specifically, an insertion of one amino acid between helix 2a and helix 2b
in Cry3Bb. Removal of this additional amino acid in Cry3B2, A104, and a D103E
substitution, as in Cry3A, resulted in loss of channel gating and the formation of
symmetrical pores. Once the pores are formed they remain open and allow a
steady conductance ranging from 25-130 pS. This designed protein,
Cry3Bb.11048, is 4.3 times more active than WT Cry3Bb against SCRW larvae.

5.14.2 Oligomerization of Cry3Bb.60 

Individual molecules of Cry3Bb or Cry3Bb.60 can form a complex with
another like molecule. Oligomerization of Cry3Bb is demonstrated by SDS-PAGE,
where samples are not heated in sample buffer prior to loading on the gel. The lack
of heat treatment allows some nondenatured toxin to remain. Oligomerization is
visualized following Coomassie staining by the appearance of a band at 2 times the
molecular weight of the monomer. The intensity of the higher molecular weight
band reflects the degree of oligomerization. The ability of Cry3Bb to form an
oligomer is not reproducibly apparent; the complex cannot be observed consistently.
Cry3Bb.60, however, forms a significantly greater amount of a higher
molecular weight complex (120 kDa). These data suggest that Cry3Bb.60 more
readily forms the higher order complex than Cry3Bb alone. Cry3Bb.60 also forms ion
channels with greater frequency than WT Cry3Bb (see Section 5.12.9).

5.14.3 Cry3Bb.11035

Changes were made in Cry3Bb to reflect the amino acid sequence in Cry3A
at the end of lα3,4 and in the beginning of helix 4. These changes resulted in the
designed protein, Cry3Bb.11035, that, unlike wild type Cry3Bb, forms
spontaneous channels with large conductances. Cry3Bb.11035 is also approximately three
times more active against SCRW larvae than WT Cry3Bb. Cry3Bb.11035 and its
amino acid changes are listed in Table 10.

5.14.4 Cry3Bb.11032 

Cry3Bb.11032 was altered at residue 165 in helix α4, changing an aspartate
to glycine, as found in Cry3A. Cry3Bb.11032 is three-fold more active than WT
Cry3Bb. The channel activity of Cry3Bb.11032 is much like that of Cry3Bb except
when the designed protein is artificially incorporated into the membrane. A 16-fold
increase in the initial channel conductances is observed compared to WT Cry3Bb
(see Section 5.12.2). This increase in initial conductance presumably is due to
enhanced quaternary structure, stability or higher-order structure.

5.14.5 EG11224 

In the WT Cry3Bb dimer structure, histidine at position 231 in domain 1
makes hydrogen bond contacts with D288 (domain 1), Y230 (domain 1), and,
through a network of water molecules, also makes contacts to D610 (domain 3), all
of the opposite monomer. D610 and K235 (domain 1) also make contact.
Replacing the histidine with an arginine, H231R, results, in one orientation, in the
formation of a salt bridge to D610 of the neighboring monomer. In a second orientation,
the contacts with D288 of the neighboring monomer, as they appear in the WT
structure, are retained. In either orientation, R231 does not hydrogen bond to Y230 of
the opposite monomer but does make contact with K235, which retains its contacts
to D610 (V. Cody, research communication). The shifting hydrogen bonds have
changed the interactions between the different domains of the protein in the
quaternary structure. Overall, fewer hydrogen bonds exist between domains 1 of the
neighboring monomers and a much stronger bond has been formed between
domains 1 and 3. Channel activity was found to be altered. Cry3Bb.11224 produces
small, quickly gating channels like Cry3Bb. However, unlike WT Cry3Bb,
Cry3Bb.11224 does not exhibit β-mercaptoethanol-dependent activation.
Replacing H231 with arginine resulted in a designed Cry3Bb protein, Cry3Bb.11224,
exhibiting a 5-fold increase in bioactivity.

5.14.6 Cry3Bb.11226

Cry3Bb.11226 is similar to Cry3Bb.11224, discussed in Section 4.8.5, in
that the histidine at position 231 has been replaced. The amino acid change,
H231T, results in the loss of the β-mercaptoethanol-dependent activation seen with
WT Cry3Bb (see Section 5.12.1). The replacement of H231, a putative metal
binding site, changes the interaction of regions in the quaternary structure, resulting
in a different type of channel activity. Cry3Bb.11226 is three-fold more active
than WT Cry3Bb.

5.14.7 Cry3Bb.11221

Cry3Bb.11221 has been re-designed in the lα3,4 region of Cry3Bb. The
channels formed by Cry3Bb.11221 are much better resolved than the
conductances formed by WT Cry3Bb (see Section 5.12.6). Cry3Bb.11221 exhibits a
6.4-fold increase in bioactivity over that of WT Cry3Bb. The amino acid changes
found in Cry3Bb.11221 are listed in Table 2.

5.14.8 Cry3Bb.11242

The designed protein, Cry3Bb.11242, carrying the alteration R290V, forms
small conductances immediately, which grow rapidly and steadily to large
conductances in about 3 min (see Section 5.12.7). This is in contrast to WT Cry3Bb
channels, which take 30-45 min to appear and grow slowly over hours to large
conductances. Cry3Bb.11242 also exhibits a 2.5-fold increase in bioactivity compared to
WT Cry3Bb.

5.14.9 Cry3Bb.11230

Cry3Bb.11230, unlike WT Cry3Bb, forms well resolved channels with long
open states. These channels reach a maximum conductance of 3000 pS but do not
continue to grow with time. Cry3Bb.11230 has been re-designed in the lβ1,α8
region of Cry3Bb and exhibits almost a 5-fold increase in activity against SCRW
larvae (Table 9) and a 5.4-fold increase against WCRW larvae (Table 10)
compared to WT Cry3Bb. The amino acid changes found in Cry3Bb.11230 are listed
in Table 2.

5.15 Example 15 - Design Method 9: Design of Structural Residues 

The specific three-dimensional structure of a protein is held in place by
amino acids that may be buried or otherwise removed from the surface of the
protein. These structural determinants can be identified by inspection of forces
responsible for the surface structure positioning. The impact of these structural
residues can then be enhanced to restrict molecular motion or diminished to enhance
molecular flexibility.

5.15.1 Cry3Bb.11095

Loops β2,3, β6,7 and β10,11, located in domain 2 of Cry3Bb, have been
identified as important for bioactivity. The three loops are surface-exposed and
structurally close together. Amino acid Q348 in the WT structure, located in
β-strand 2 just prior to lβ2,3, does not form any intramolecular contacts. However,
replacing Q348 with arginine (Q348R) results in the formation of 2 new hydrogen
bonds between R348 and the backbone carbonyls of R487 and R488, both located
in lβ10,11. The new hydrogen bonds may act to stabilize the structure formed by
the three loops. Certainly, the structure around R348 is more tightly packed, as
determined by X-ray crystallography. The designed protein carrying this change,
Cry3Bb.11095, is 4.6-fold more active than WT Cry3Bb.

5.16 Example 16 - Design Method 10: Combinatorial Analysis and
Mutagenesis

Individual sites in the engineered Cry3Bb molecule can be used together to
create a Cry3Bb molecule with activity even greater than the activity of any one
site. This method has not been precisely applied to any δ-endotoxin. It is also not
obvious that improvements in two sites can be pulled together to improve the
biological activity of the protein. In fact, data demonstrate that improvements to 2
sites, when pulled together into a single construct, do not necessarily further
improve the biological activity of Cry3Bb. In some cases, the combination resulted
in decreased protein stability and/or activity. Examples of proteins with site
combinations that resulted in improved activity compared to WT Cry3Bb but decreased
activity compared to 1 or more of the "parental" proteins are Cry3Bb.11235,
11046, 11057 and 11058. Cry3Bb.11082, which contains designed regions from 4
parental proteins, retains the level of activity from the most active parental strain
(Cry3Bb.11230) but does not show an increase in activity. These proteins are listed
in Table 7. The following are examples of instances where combined mutations
have significantly improved biological activity.

5.16.1 Cry3Bb.11231

Designed protein Cry3Bb.11231 contains the alterations found in
Cry3Bb.11224 (H231R) and Cry3Bb.11228 (changes in lβ1,α8). The combination
of amino acid changes found in Cry3Bb.11231 results in an increase in bioactivity
against SCRW larvae of approximately 8-fold over that of WT Cry3Bb (Table 2).
This increase is greater than that exhibited by either Cry3Bb.11224 (5.0x) or
Cry3Bb.11228 (4.1x) alone. Cry3Bb.11231 also exhibits a 12.9-fold
increase in activity compared to WT Cry3Bb against WCRW larvae (Table 10).

5.16.2 Cry3Bb.11081

Designed Cry3Bb protein Cry3Bb.11081 was constructed by combining the
changes found in Cry3Bb.11032 and Cry3Bb.11229 (with the exception of
Y318C). Cry3Bb.11081 exhibits a 6.1-fold increase in activity over WT Cry3Bb, a
greater increase in activity than either of the individual parental proteins,
Cry3Bb.11032 (3.1-fold) and Cry3Bb.11229 (2.5-fold).

5.16.3 Cry3Bb.11083

Designed Cry3Bb protein Cry3Bb.11083 was constructed by combining the
changes found in Cry3Bb.11036 and Cry3Bb.11095. Cry3Bb.11083 exhibits a
7.4-fold increase in activity against SCRW larvae compared to WT Cry3Bb, a
greater increase than either Cry3Bb.11036 (4.3x) or Cry3Bb.11095 (4.6x).
Cry3Bb.11083 also exhibits a 5.4-fold increase in activity against WCRW larvae
compared to WT Cry3Bb (Table 10).

5.16.4 Cry3Bb.11084

Designed Cry3Bb protein Cry3Bb.11084 was constructed by combining the
changes found in Cry3Bb.11032 and the S311L change found in Cry3Bb.11228.
Cry3Bb.11084 exhibits a 7.2-fold increase in activity over that of WT Cry3Bb, a
greater increase than either Cry3Bb.11032 (3.1x) or Cry3Bb.11228 (4.1x).
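
The comparisons made in Sections 5.16.1 through 5.16.4 can be tabulated with a
short sketch. It is only an illustration; the data structure and function names
are hypothetical. The fold-increase values are those stated in the text (the
value for Cry3Bb.11231 is given there as approximately 8-fold), and the ratio to
the best parent is a convenience figure computed here, not a quantity reported
by the inventors.

```python
# Fold increase in activity over WT Cry3Bb, as stated in Sections 5.16.1-5.16.4.
# Each entry: combined protein -> (combined fold increase, {parent: fold increase}).
COMBINATIONS = {
    "Cry3Bb.11231": (8.0, {"Cry3Bb.11224": 5.0, "Cry3Bb.11228": 4.1}),
    "Cry3Bb.11081": (6.1, {"Cry3Bb.11032": 3.1, "Cry3Bb.11229": 2.5}),
    "Cry3Bb.11083": (7.4, {"Cry3Bb.11036": 4.3, "Cry3Bb.11095": 4.6}),
    "Cry3Bb.11084": (7.2, {"Cry3Bb.11032": 3.1, "Cry3Bb.11228": 4.1}),
}


def gain_over_best_parent(combined, parents):
    """Ratio of the combined fold increase to that of the most active parent."""
    return combined / max(parents.values())


def improved_combinations(table):
    """Names of combined proteins that are more active than every parent."""
    return [
        name
        for name, (combined, parents) in table.items()
        if combined > max(parents.values())
    ]


if __name__ == "__main__":
    for name, (combined, parents) in COMBINATIONS.items():
        ratio = gain_over_best_parent(combined, parents)
        print(f"{name}: {combined}x vs best parent, ratio {ratio:.2f}")
```

The same check applied to the proteins named in Section 5.16 as less active than
a parent would exclude them, but their fold values are not given in this section
and so are not included.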

5.16.5 Cry3Bb.11098

Designed Cry3Bb protein Cry3Bb.11098 was constructed to contain the
following amino acid changes: D165G, H231R, S311L, N313T, and E317K. The
nucleic acid sequence is given in SEQ ID NO: 107, and the encoded amino acid
sequence is given in SEQ ID NO: 108.
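
How a list of point changes such as the one above maps onto a sequence can be
sketched as follows. This is a minimal illustration under stated assumptions:
the function names are hypothetical, and the sequence is a toy placeholder built
only so that the wild-type residues sit at the named positions. It is not the
Cry3Bb sequence, which is given in the sequence listing and not reproduced here.

```python
import re

# Amino acid changes listed for Cry3Bb.11098 in the text.
CHANGES = ["D165G", "H231R", "S311L", "N313T", "E317K"]


def parse_substitution(code):
    """Split a code such as 'D165G' into (wild-type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a point substitution: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def toy_sequence(length, codes):
    """Build a placeholder sequence (all 'A') carrying the wild-type residues."""
    residues = ["A"] * length
    for code in codes:
        wild_type, position, _replacement = parse_substitution(code)
        residues[position - 1] = wild_type  # positions are 1-based
    return "".join(residues)


def apply_substitutions(sequence, codes):
    """Apply each change after checking the expected wild-type residue."""
    residues = list(sequence)
    for code in codes:
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: found {residues[position - 1]!r} instead")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    wild_type = toy_sequence(320, CHANGES)
    designed = apply_substitutions(wild_type, CHANGES)
    print([designed[parse_substitution(c)[1] - 1] for c in CHANGES])
```

Checking the wild-type residue before each replacement catches numbering
mistakes, for example applying changes numbered against the full-length protein
to a truncated sequence.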

5.17 Example 17 - Design Strategy 11: Alteration of Binding to
Glycoproteins and to WCRW Brush Border Membranes

While the identity of receptor(s) for Cry3Bb is unknown, it is nonetheless
important to increase the interaction of the toxin with its receptor. One way to
improve the toxin-receptor interaction without knowing the identity of the receptor is to
reduce or eliminate non-productive binding to other biomolecules. The inventors
have observed that Cry3Bb binds non-specifically to bovine serum albumin (BSA)
that has been glycosylated with a variety of sugar groups, but not to
non-glycosylated BSA. Cry3A, which is not active on Diabrotica species, shows
similar but even greater binding to glycosylated-BSA. Similarly, Cry3A shows
greater binding to immobilized WCRW brush border membrane (BBM) than does
WT Cry3Bb, suggesting that much of the observed binding is non-productive. It
was reasoned that the non-specific binding to WCRW BBM occurs via
glycosylated proteins, and that binding to both glycosylated-BSA and WCRW BBM is
non-productive in the reaction pathway to toxicity. Therefore, reduction or elimination
of that binding would lead to enhanced binding to the productive receptor and to
enhanced toxicity. Potential binding sites for sugar groups were targeted for
redesign to reduce the non-specific binding of Cry3Bb to glycoproteins and to
immobilized WCRW BBM.

5.17.1 Cry3Bb.60 

Cry3Bb.60, in which Cry3Bb has been cleaved at R159 in lα3,4, shows
decreased binding to glycosylated-BSA and decreased binding to immobilized
WCRW BBM. Cry3Bb.60 shows a 3.6-fold increase in bioactivity relative to WT
Cry3Bb.

5.17.2 Alterations to lα3,4 (Cry3Bb.11221)

Cry3Bb.11221 has been redesigned in the lα3,4 region of domain 1, which
is the region in which Cry3Bb is cleaved to produce Cry3Bb.60. Cry3Bb.11221
also shows decreased binding to both glycosylated-BSA and immobilized WCRW
BBM, and exhibits a 6.4-fold increase in bioactivity over that of WT Cry3Bb.
Together with data for Cry3Bb.60 (section 5.17.1), these data suggest that this loop
region contributes substantially to non-productive binding of the toxin.

5.17.3 Alterations to lβ1,α8 (Cry3Bb.11228, 11230, 11237 and 11231)

The lβ1,α8 region of Cry3Bb has been re-engineered to increase hydration
(section 4.2.4) and enhance flexibility (section 4.4.3). Several proteins altered in
this region, Cry3Bb.11228, 11230, and 11237, demonstrate substantially lower
levels of binding to both glycosylated-BSA and immobilized WCRW BBM, and also
show between 4.1- and 4.5-fold increases in bioactivity relative to WT Cry3Bb.

5.17.4 Binding Activity 

The tendencies of Cry3Bb and some of its derivatives to bind to
glycosylated-BSA and to WCRW BBM were determined using a BIAcore™ surface
plasmon resonance biosensor. For glycosylated-BSA binding, the glycosylated protein
was immobilized using standard NHS chemistry to a CM5 chip (BIAcore), and the
solubilized toxin was injected over the glycosylated-BSA surface. To measure
binding to WCRW BBM, brush border membrane vesicles (BBMV) purified from
WCRW midguts (English et al., 1991) were immobilized on an HPA chip
(BIAcore), then washed with either 10 mM KOH or with 40 mM β-octylglucoside.
The solubilized toxin was then injected over the resulting hybrid bilayer surface to
detect binding. Protein concentrations were determined by Protein Dye Reagent
assay (BioRad) or BCA Protein Assay (Pierce).

Other methods may also be used to determine the same binding
information. These include, but are not limited to, ligand blot experiments using labeled
toxin, labeled glycosylated protein, or anti-toxin antibodies, affinity
chromatography, and in vitro binding of toxin to intact BBMV.

5.18 Example 18 - Construction of Plasmids With WT cry3Bb Sequences

Standard recombinant DNA procedures were performed essentially as described
by Sambrook et al. (1989).

5.18.1 pEG1701

pEG1701 (FIG. 11), contained in EG11204 and EG11037, was constructed by
inserting the SphI-PstI fragment containing the cry3Bb gene and the cry1F
terminator from pEG911 (Baum, 1994) into the SphI-PstI site of pEG854.9 (Baum
et al., 1996), a high copy number B. thuringiensis - E. coli shuttle vector.

5.18.2 pEG1028

pEG1028 contains the HindIII fragment of cry3Bb from pEG1701 cloned into the
multiple cloning site of pTZ18U at HindIII.

5.19 Example 19 - Construction of Plasmids with Altered cry3Bb Genes

Plasmid DNA from E. coli was prepared by the alkaline lysis method (Maniatis
et al., 1982) or by commercial plasmid preparation kits (examples:
PERFECTprep™ kit, 5 Prime - 3 Prime, Inc., Boulder CO; QIAGEN plasmid prep
kit, QIAGEN Inc.). B. thuringiensis plasmids were prepared from cultures grown
in brain heart infusion plus 0.5% glycerol (BHIG) to mid logarithmic phase by
the alkaline lysis method. When necessary for purification, DNA fragments were
excised from an agarose gel following electrophoresis and recovered by glass
milk using a Geneclean II® kit (BIO 101 Inc., La Jolla, CA). Alteration of the
cry3Bb gene was accomplished using several techniques including site-directed
mutagenesis, triplex PCR™, quasi-random PCR™ mutagenesis, DNA shuffling and
standard recombinant techniques. These techniques are described in Sections
5.20, 5.21, 5.22, 5.23 and 5.24, respectively. The DNA sequences of primers
used are listed in Section 5.25.

5.20 Example 20 - Site-Directed Mutagenesis

Site-directed mutagenesis was conducted by the protocols established by
Kunkle (1985) and Kunkle et al. (1987) using the Muta-Gene™ M13 in vitro
mutagenesis kit (Bio-Rad, Richmond, CA). Combinations of alterations to cry3Bb
were accomplished by using the Muta-Gene™ kit and multiple mutagenic
oligonucleotide primers.

5.20.1 pEG1041

pEG1041, contained in EG11032, was constructed using the Muta-Gene™ kit,
primer C, and single-stranded pEG1028 as the DNA template. The resulting
altered cry3Bb DNA sequence was excised as a PflMI DNA fragment and used to
replace the corresponding DNA fragment in pEG1701.

5.20.2 pEG1046

pEG1046, contained in EG11035, was constructed using the Muta-Gene™ kit,
primer D, and single-stranded pEG1028 as the DNA template. The resulting
altered cry3Bb DNA sequence was excised as a PflMI DNA fragment and used to
replace the corresponding DNA fragment in pEG1701.

5.20.3 pEG1047

pEG1047, contained in EG11036, was constructed using the Muta-Gene™ kit,
primer E, and single-stranded pEG1028 as the DNA template. The resulting
altered cry3Bb DNA sequence was excised as a PflMI DNA fragment and used to
replace the corresponding DNA fragment in pEG1701.

5.20.4 pEG1052

pEG1052, contained in EG11046, was constructed using the Muta-Gene™ kit,
primers D and E, and single-stranded pEG1028 as the DNA template. The
resulting altered cry3Bb DNA sequence was excised as a PflMI DNA fragment and
used to replace the corresponding DNA fragment in pEG1701.

5.20.5 pEG1054

pEG1054, contained in EG11048, was constructed using the Muta-Gene™ kit,
primer F, and single-stranded pEG1028 as the DNA template. The resulting
altered cry3Bb DNA sequence was excised as a PflMI DNA fragment and used to
replace the corresponding DNA fragment in pEG1701.

5.20.6 pEG1057

pEG1057, contained in EG11051, was constructed using the Muta-Gene™ kit,
primer G, and single-stranded pEG1028 as the DNA template. The resulting
altered cry3Bb DNA sequence was excised as a PflMI DNA fragment and used to
replace the corresponding DNA fragment in pEG1701.

5.21 Example 21 - Triplex PCR™

Triplex PCR™ is described by Michael (1994). This method makes use of a
thermostable ligase to incorporate a phosphorylated mutagenic primer into an
amplified DNA fragment during PCR™. PCR™ was performed on a Perkin Elmer
Cetus DNA Thermal Cycler (Perkin-Elmer, Norwalk, CT) using an AmpliTaq™ DNA
polymerase kit (Perkin-Elmer) and SphI-linearized pEG1701 as the template
DNA. PCR™ products were cleaned using commercial kits such as Wizard™ PCR™
Preps (Promega, Madison, WI) and QIAquick PCR™ Purification kit (QIAGEN Inc.,
Chatsworth, CA).

5.21.1 pEG1708 and pEG1709

pEG1708 and pEG1709, contained in EG11222 and EG11223, respectively, were
constructed by replacing the PflMI-PflMI fragment of cry3Bb in pEG1701 with
PflMI-digested and gel purified PCR™ fragment altered at cry3Bb nucleotide
positions 688-690, encoding amino acid Y230. Random mutations were introduced
into the Y230 codon by triplex PCR™. Mutagenic primer MVT095 was
phosphorylated and used together with outside primer pair FW001 and FW006.
Primer MVT095 also contains a silent mutation at position 687, changing T to
C, which, upon incorporation, introduces an additional EcoRI site into
pEG1701.
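The pairing of residue numbers with nucleotide positions used in these examples (Y230 with 688-690, R290 with 868-870, residues 154-160 with 460-480) is consistent with simple codon arithmetic. The following minimal sketch illustrates it under the assumption that nucleotide 1 is the first base of the start codon and that residue 1 is the initiator methionine; the function names are illustrative only and do not come from the specification.

```python
from typing import Tuple


def codon_nucleotide_range(residue: int) -> Tuple[int, int]:
    """Return the 1-based (first, last) nucleotide positions of one codon.

    Assumes nucleotide 1 is the first base of the start codon.
    """
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    last = 3 * residue
    return last - 2, last


def region_nucleotide_range(first_residue: int, last_residue: int) -> Tuple[int, int]:
    """Return the nucleotide span covering a run of consecutive residues."""
    if last_residue < first_residue:
        raise ValueError("last_residue must not precede first_residue")
    start, _ = codon_nucleotide_range(first_residue)
    _, end = codon_nucleotide_range(last_residue)
    return start, end


if __name__ == "__main__":
    print("Y230:", codon_nucleotide_range(230))
    print("R290:", codon_nucleotide_range(290))
    print("residues 154-160:", region_nucleotide_range(154, 160))
```

Such a check is convenient when designing a mutagenic primer, because it confirms that the degenerate positions in the primer line up with the intended codon.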



5.21.2 pEG1710, pEG1711 and pEG1712

Plasmids pEG1710, pEG1711 and pEG1712, contained in EG11224, EG11225 and
EG11226, respectively, were created by replacing the PflMI-PflMI fragment of
the cry3Bb gene in pEG1701 with PflMI-digested and gel purified PCR™ fragment
altered at cry3Bb nucleotide positions 690-692, encoding H231. Random
mutations were introduced into the H231 codon by triplex PCR™. Mutagenic
primer MVT097 was phosphorylated and used together with outside primer pair
FW001 and FW006. Primer MVT097 also contains a T to C sequence change at
position 687 which, upon incorporation, results in an additional EcoRI site by
silent mutation.



5.21.3 pEG1713 and pEG1727

pEG1713 and pEG1727, contained in EG11227 and EG11242, respectively, were
constructed by replacing the PflMI-PflMI fragment of the cry3Bb gene in
pEG1701 with PflMI-digested and gel purified PCR™ fragment altered at cry3Bb
nucleotide positions 868-870, encoding amino acid R290. Triplex PCR™ was used
to introduce random changes into the R290 codon. The mutagenic primer,
MVT091, was designed so that the nucleotide substitutions would result in
approximately 36% of the sequences encoding amino acids D or E. MVT091 was
phosphorylated and used together with outside primer pair FW001 and FW006.

5.22 Example 22 - Quasi-Random PCR™ Mutagenesis

Quasi-random mutagenesis combines the mutagenic PCR™ techniques described by
Vallette et al. (1989), Tomic et al. (1990) and LaBean and Kauffman (1993).
Mutagenic primers, sometimes over 70 nucleotides in length, were designed to
introduce changes over nucleotide positions encoding an entire structural
region, such as a loop. Degenerate codons typically consisted of a ratio of
82% WT nucleotide plus 6% each of the other 3 nucleotides per position to
semi-randomly introduce changes over the target region (LaBean and Kauffman,
1993). When possible, natural restriction sites were utilized; class 2s
enzymes were used when natural sites were not convenient (Stemmer and Morris,
1992, list additional restriction enzymes useful to this technique). PCR™ was
performed on a Perkin Elmer Cetus DNA Thermal Cycler (Perkin-Elmer, Norwalk,
CT) using an AmpliTaq™ DNA polymerase kit (Perkin-Elmer) and SphI-linearized
pEG1701 as the template DNA. Quasi-random PCR™ amplification was performed
using the following conditions: denaturation at 94°C for 1.5 min.; annealing
at 50°C for 2 min. and extension at 72°C for 3 min., for 30 cycles. The final
14 extension cycles were extended an additional 25 s per cycle. Primer
concentration was 20 µM per reaction or 40 µM for long, mutagenic primers.
PCR™ products were cleaned using commercial kits such as Wizard™ PCR™ Preps
(Promega, Madison, WI) and QIAquick PCR™ Purification kit (QIAGEN Inc.,
Chatsworth, CA). In some instances PCR™ products were treated with Klenow
Fragment (Promega) following the manufacturer's instructions to fill in any
single base overhangs prior to restriction digestion.
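The 82%/6%/6%/6% doping ratio determines how heavily a target region is mutated. The sketch below, which is illustrative only, computes the expected number of substitutions and the fraction of unchanged primers for a doped stretch of a given length, and simulates one doped oligonucleotide. It assumes every position is doped independently at the stated ratio; the template sequence in the usage section is a hypothetical placeholder, not cry3Bb sequence.

```python
import random

WT_FRACTION = 0.82      # probability a position keeps the WT nucleotide
OTHER_FRACTION = 0.06   # probability of each of the three other nucleotides


def expected_substitutions(n_positions: int) -> float:
    """Mean number of non-WT nucleotides across n doped positions."""
    return n_positions * (1.0 - WT_FRACTION)


def prob_unchanged(n_positions: int) -> float:
    """Probability that every doped position retains the WT nucleotide."""
    return WT_FRACTION ** n_positions


def dope(template: str, rng: random.Random) -> str:
    """Simulate one doped oligonucleotide synthesized against a WT template."""
    out = []
    for base in template:
        others = [b for b in "ACGT" if b != base]
        choices = [base] + others
        weights = [WT_FRACTION] + [OTHER_FRACTION] * 3
        out.append(rng.choices(choices, weights=weights)[0])
    return "".join(out)


if __name__ == "__main__":
    n = 21  # e.g. a 7-codon loop such as nucleotides 460-480
    print("expected substitutions:", expected_substitutions(n))
    print("fraction left unchanged:", prob_unchanged(n))
    hypothetical_template = "ACGTACGTACGTACGTACGTA"  # placeholder sequence
    print(dope(hypothetical_template, random.Random(0)))
```

For a 21-nucleotide loop this ratio gives a mean of roughly four substitutions per primer, with only a small fraction of primers left fully WT, which is consistent with the stated aim of semi-random change across a whole structural region.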

5.22.1 pEG1707

pEG1707, contained in EG11221, was constructed by replacing the PflMI-PflMI
fragment of the cry3Bb gene in pEG1701 with PflMI-digested and gel purified
PCR™ fragment altered at cry3Bb nucleotide positions 460-480, encoding lα3,4
amino acids 154-160. Primer MVT075, which includes a recognition site for the
class 2s restriction enzyme BsaI, and primer FW006 were used to introduce
changes into this region by quasi-random mutagenesis. Primer MVT076, also
containing a BsaI site, and primer FW001 were used to PCR™ amplify a "linker"
fragment. Following PCR™ amplification, both products were cleaned,
end-filled, digested with BsaI and ligated to each other. Ligated fragment was
gel purified and used as template for PCR™ amplification using primer pair
FW001 and FW006. PCR™ product was cleaned, digested with PflMI, gel purified
and ligated into PflMI-digested and purified pEG1701 vector DNA.

5.22.2 pEG1720 and pEG1726

pEG1720 and pEG1726, contained in EG11234 and EG11241, respectively, were
constructed by replacing the PflMI-PflMI fragment of the cry3Bb gene in
pEG1701 with PflMI-digested and gel purified PCR™ fragment altered at cry3Bb
nucleotide positions 859-885, encoding lα7,β1 amino acids 287-295.
Quasi-random PCR™ mutagenesis was used to introduce changes into this region.
Mutagenic primer MVT111, designed with a BsaI site, and primer FW006 were
used to introduce the changes. Primer pair MVT094, also containing a BsaI
site, and FW001 were used to amplify the linker fragment. The PCR™ products
were digested with BsaI, gel purified then ligated to each other. Ligated
product was PCR™ amplified using primer pair FW001 and FW006, digested with
PflMI

5.22.3 pEG1714, pEG1715, pEG1716, pEG1718, pEG1719, pEG1722, pEG1723,
pEG1724 and pEG1725

pEG1714, pEG1715, pEG1716, pEG1718, pEG1719, pEG1722, pEG1723, pEG1724 and
pEG1725, contained in EG11228, EG11229, EG11230, EG11232, EG11233, EG11236,
EG11237, EG11238 and EG11239, respectively, were constructed by replacing the
PflMI-PflMI fragment of the cry3Bb gene in pEG1701 with PflMI-digested and gel
purified PCR™ fragment altered at cry3Bb nucleotide positions 931-954,
encoding lβ1,α8 amino acids 311-318. Quasi-random PCR™ mutagenesis was used
to introduce changes into this region using mutagenic primer MVT103 and primer
FW006. Primers FW001 and FW006 were used to amplify a linker fragment. The
PCR™ products were end-filled using Klenow and digested with BamHI. The
larger fragment from the FW001-FW006 digest was gel purified then ligated to
the digested MVT103-FW006 fragment. Ligated product was gel purified and
amplified by PCR™ using primer pair FW001 and FW006. The amplified product
was digested with PflMI and gel purified prior to ligation into
PflMI-digested and purified pEG1701 vector DNA.

5.22.4 pEG1701.lβ2,3

Plasmids carrying alterations of cry3Bb WT sequence at nucleotides 1051-1065,
encoding structural region lβ2,3 of Cry3Bb, were constructed by replacing the
MluI-SpeI fragment of pEG1701 with isolated MluI- and SpeI-digested PCR™
product. The PCR™ product was generated by quasi-random PCR™ mutagenesis
where mutagenic primer MVT081 was paired with FW006. These plasmids as a
group are designated pEG1701.lβ2,3.

5.22.5 pEG1701.lβ6,7

Plasmids containing mutations of the cry3Bb WT sequence at nucleotides
1234-1248, encoding structural region lβ6,7 of Cry3Bb, were constructed by
replacing the MluI-SpeI fragment of pEG1701 with isolated MluI- and
SpeI-digested PCR™ product. The PCR™ product was generated by quasi-random
PCR™ mutagenesis where mutagenic primer MVT085 was paired with primer WD115.
Primer pair MVT089 and WD112 were used to amplify a linker fragment. Both
PCR™ products were digested with TaqI and ligated to each other. The ligation
product was gel purified and PCR™ amplified using primer pair MVT089 and
FW006. The amplified product was digested with MluI and SpeI and ligated into
MluI and SpeI digested and purified pEG1701 vector DNA. These plasmids as a
group are designated pEG1701.lβ6,7.

5.22.6 pEG1701.lβ10,11

Plasmids containing mutated cry3Bb sequences at nucleotides 1450-1467,
encoding structural region lβ10,11 of Cry3Bb, were constructed by replacing
the SpeI-PstI fragment of pEG1701 with isolated SpeI- and PstI-digested PCR™
product. The PCR™ product was generated by quasi-random PCR™ mutagenesis
where mutagenic primer MVT105 was paired with primer MVT070. Primer pair
MVT092 and MVT083 were used to generate a linker fragment. (MVT083 is a
mutagenic oligo designed for another region. The sequence changes introduced
by MVT083 are removed following restriction digestion and do not impact the
alteration of cry3Bb in the lβ10,11 region.) Both PCR™ products were digested
with BsaI, ligated together, and the ligation product PCR™ amplified with
primer pair MVT083 and MVT070. The resulting PCR™ product was digested with
SpeI and PstI, and gel purified. These plasmids as a group are designated
pEG1701.lβ10,11.

5.23 Example 23 - DNA Shuffling

DNA-shuffling, as described by Stemmer (1994), was used to combine individual
alterations in the cry3Bb gene.

5.23.1 pEG1084, pEG1085, pEG1086 and pEG1087

pEG1084, pEG1085, pEG1086, and pEG1087, contained in EG11081, EG11082,
EG11083, and EG11084, respectively, were recovered from DNA-shuffling.
Briefly, PflMI DNA fragments were generated using primer set A and B and each
of the plasmids pEG1707, pEG1714, pEG1715, pEG1716, pEG1041, pEG1046,
pEG1047, and pEG1054 as DNA templates. The resulting DNA fragments were
pooled in equimolar amounts and digested with DNaseI, and 50-100 bp DNA
fragments were recovered from an agarose gel by three successive freeze-thaw
cycles: three min in a dry-ice ethanol bath followed by complete thawing at
50°C. The recovered DNA fragments were assembled by primerless-PCR™ and
PCR™-amplified with primer set A and B as described by Stemmer (1994). The
final PCR™-amplified DNA fragments were cut with PflMI and used to replace
the corresponding cry3Bb PflMI DNA fragment in pEG1701.

5.24 Example 24 - Recombinant DNA Techniques

Standard recombinant DNA procedures were performed essentially as described
by Sambrook et al. (1989).

5.24.1 pEG1717

pEG1717, contained in EG11231, was constructed by replacing the small BglII
fragment of pEG1710 with the small BglII fragment from pEG1714.

5.24.2 pEG1721

pEG1721, contained in EG11235, was constructed by replacing the small BglII
fragment from pEG1710 with the small BglII fragment from pEG1087.






5.24.3 pEG1062

pEG1062, contained in EG11057, was constructed by replacing the NcoI DNA
fragment containing ori 43 from pEG1054 with the isolated NcoI DNA fragment
containing ori 43 and the alterations in cry3Bb from pEG1046.

5.24.4 pEG1063

pEG1063, contained in EG11058, was constructed by replacing the NcoI DNA
fragment containing ori 43 from pEG1054 with the isolated NcoI DNA fragment
containing ori 43 and the alterations in cry3Bb from pEG1707.

5.24.5 pEG1095

pEG1095, contained in EG11095, was constructed by replacing the MluI-SpeI DNA
fragment in pEG1701 with the corresponding MluI-SpeI DNA fragment from
pEG1086.

5.25 Example 25 - Primers Utilized in Constructing Cry3Bb* Variants

Shown below are the primers used for site-directed mutagenesis, triplex PCR™
and quasi-random PCR™ to prepare the cry3Bb* variants as described above.
Primers were obtained from Ransom Hill Bioscience, Inc. (Ramona, CA) and
Integrated DNA Technologies, Inc. (Coralville, IA). The specific composition
of the primers containing particular degeneracies at one or more residues is
given in Section 5.30, Example 30.

5.25.1 Primer FW001 (SEQ ID NO:71):
5'-AGACAACTCTACAGTAAAAGATG-3'

5.25.2 Primer FW006 (SEQ ID NO:72):
5'-GGTAATTGGTCAATAGAATC-3'

5.25.3 Primer MVT095 (SEQ ID NO:73):
5'-CAGAAGATGTTGCTGAATTCNNNCATAGACAATTAAAAC-3'

5.25.4 Primer MVT097 (SEQ ID NO:74):
5'-GATGTTGCTGAATTCTATNNNAGACAATTAAAAC-3'

5.25.5 Primer MVT091 (SEQ ID NO:75):
5'-CCCATTTTATGATATTBDNTTATACTCAAAAGG-3'



5.25.6 Primer MVT075 (SEQ ID NO:76):
5'-AGCTATGCTGGTCTCGGAAGAAAEFNFFNFJNJFJFJNFINJFJAAAAGAAGCCAAGATCGAAT-3'

5.25.7 Primer MVT076 (SEQ ID NO:77):
5'-GGTCACCTAGGTCTCTCTTCCAGGAATTTAACGCATTAAC-3'

5.25.8 Primer MVT111 (SEQ ID NO:78):
5'-AGCTATGCTGGTCTCCCATTTJEHIEJEJJEIIKRRJEHEIJEENIIIGTTAAAACAGAACTAAC-3'

5.25.9 Primer MVT094 (SEQ ID NO:79):
5'-ATCCAGTGGGGTCTCAAATGGGAAAAGTACAATTAG-3'

5.25.10 Primer MVT103 (SEQ ID NO:80):
5'-CATTTTTACGGATCCAATTTTTJFFFJNEEJEFNFJNFEILEIJEOGGACCAACTTTTTTGAG-3'

5.25.11 Primer MVT081 (SEQ ID NO:81):
5'-GAATTTCATACGCGTCTTCAACCTGGTJEHJJJIINMEEIEJTCTTTCAATTATTGGTCTGG-3'

5.25.12 Primer MVT085 (SEQ ID NO:82):
5'-AAAAGTTTATCGAACTATAGCTAATACAGACGTAGCGGCTJQQFFNEEJIIJEEIGTATATTTAGGTGTTACG-3'

5.25.13 Primer A (SEQ ID NO:83) 3b2pflm1:
5'-GGAGTTCCATTTGCTGGGGC-3'

5.25.14 Primer B (SEQ ID NO:84) 3b2pflm2:
5'-ATCTCCATAAAATGGGG-3'

5.25.15 Primer C (SEQ ID NO:85) 3B2165DG:
5'-GCGAAGTAAAAGAAGCCAAGGTCGAATAAGGG-3'

5.25.16 Primer D (SEQ ID NO:86) 3B2160SKRD:
5'-CCTTTAAGTTTGCGAAATCCACACAGCCAAGGTCGAATAAGGG-3'

5.25.17 Primer E (SEQ ID NO:87) 3b2290VP:
5'-CCCATTTTATGATGTTCGGTTATACCCAAAAGGGG-3'

5.25.18 Primer F (SEQ ID NO:88) 3b2EdA104:
5'-GGCCAAGTGAAGACCCATGGAAGGC-3'

5.25.19 Primer G (SEQ ID NO:89) 3b2KG189:
5'-GCAGTTTCCGGATTCGAAGTGC-3'

5.25.20 Primer WD112 (SEQ ID NO:90):
5'-CCGCTACGTCTGTATTA-3'

5.25.21 Primer WD115 (SEQ ID NO:91):
5'-ATAATGGAAGCACCTGA-3'

5.25.22 Primer MVT105 (SEQ ID NO:92):
5'-AGCTATGCTGGTCTCTTCTTAEJIFEIIEFFIJFU11NACAATTCCATTTTTTACTTGG-3'

5.25.23 Primer MVT092 (SEQ ID NO:93):
5'-ATCCAGTTGGGTCTCTAAGAAACAAACCGCGTAATTAAGC-3'

5.25.24 Primer MVT070 (SEQ ID NO:94):
5'-CCTCAAGGGTTATAACATCC-3'

5.25.25 Primer MVT083 (SEQ ID NO:95):
5'-GTACAAAAGCTAAGCTTTIEJIINPEEMEEIJNJESCGAACTATAGCTAATACAG-3'



5.26 Example 26 - Sequence Analysis of Altered cry3Bb Genes

E. coli DH5α™ (GIBCO BRL, Gaithersburg, MD), JM110 and Sure™ (Stratagene, La
Jolla, CA) cells were sometimes used to amplify plasmid DNA for sequencing.
Plasmids were transformed into these cells using the manufacturers'
procedures. DNA was sequenced using the Sequenase® 2.0 DNA sequencing kit
purchased from U. S. Biochemical Corporation (Cleveland, Ohio). The plasmids
described in Section 6, their respective divergence from WT cry3Bb sequence,
the resulting amino acid changes and the protein structure site of the changes
are listed in Table 11.

5.27 Example 27 - Expression of Cry3Bb* Proteins

5.27.1 Culture Conditions

LB agar was prepared using a standard formula (Maniatis et al., 1982). Starch
agar was obtained from Difco Laboratories (Detroit, MI) and supplemented with
an additional 5 g/l of agar. C2 liquid medium is described by Donovan et al.
(1988). C2 medium was sometimes prepared without the phosphate buffer (C2-P).
All cultures were incubated at 25°C to 30°C; liquid cultures were also shaken
at 250 rpm, until sporulation and lysis had occurred.

5.27.2 Transformation Conditions

pEG1701 and derivatives thereof were introduced into acrystalliferous
B. thuringiensis var. kurstaki EG7566 (Baum, 1994) or EG10368 (U. S. Patent
5,322,687) by the electroporation method of Macaluso and Mettus (1991). In
some cases, the method was modified as follows to maximize the number of
transformants. The recipient B. thuringiensis strain was inoculated from
overnight growth at 30°C on LB agar into brain heart infusion plus 0.5%
glycerol, grown to an optical density of approximately 0.5 at 600 nm, chilled
on ice for 10 min, washed 2X with EB and resuspended in a 1/50 volume of EB.
Transformed cells were selected on LB agar or starch agar plus 5 µg/ml
chloramphenicol. Visual screening of colonies was used to identify
transformants producing crystalline protein; those colonies were generally
more opaque than colonies that did not produce crystalline protein.

5.27.3 Strain and Protein Designations

A transformant containing an altered cry3Bb* gene encoding an altered Cry3Bb*
protein is designated by an "EG" number, e.g., EG11231. The altered Cry3Bb*
protein is designated Cry3Bb followed by the strain number, e.g.,
Cry3Bb.11231. Collections of proteins with alterations at a structural site
are designated Cry3Bb followed by the structural site, e.g., Cry3Bb.lβ2,3.
Table 12 lists the plasmids pertinent to this invention, the new
B. thuringiensis strains containing the plasmids, the acrystalliferous
B. thuringiensis recipient strain used, and the proteins produced by the new
strains.
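The designation rule above is mechanical, so it can be expressed as a small helper when cross-checking strain and protein names. The following is a minimal sketch; the function names are hypothetical, and the tolerance for a stray space inside the strain number is an added convenience rather than part of the convention.

```python
import re


def protein_designation(strain: str) -> str:
    """Map a strain name such as 'EG11231' to its protein name 'Cry3Bb.11231'."""
    match = re.fullmatch(r"EG\s*(\d+)", strain.strip())
    if match is None:
        raise ValueError(f"not an EG strain designation: {strain!r}")
    return f"Cry3Bb.{match.group(1)}"


def collection_designation(structural_site: str) -> str:
    """Name a collection of proteins altered at one structural site."""
    return f"Cry3Bb.{structural_site}"


if __name__ == "__main__":
    print(protein_designation("EG11231"))
    print(collection_designation("lβ2,3"))
```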



5.28 Example 28 - Generation and Characterization of Cry3Bb-60

5.28.1 Generation of Cry3Bb-60

Cry3Bb-producing strain EG7231 (U. S. Patent 5,187,091) was grown in C2 medium
plus 3 mg/ml chloramphenicol. Following sporulation and lysis, the culture was
washed with water and Cry3Bb protein purified by the NaBr solubilization and
recrystallization method of Cody et al. (1992). Protein concentration was
determined by BCA Protein Assay (Pierce, Rockford, IL). Recrystallized protein
was solubilized in 10 ml of 50 mM KOH per 100 mg of Cry3Bb protein and
buffered to pH 9.0 with 100 mM CAPS (3-[cyclohexylamino]-1-propanesulfonic
acid), pH 9.0. The soluble toxin was treated with trypsin at a weight ratio of
50 mg toxin to 1 mg trypsin for 20 min to overnight at room temperature.
Trypsin cleaves proteins on the carboxyl side of available arginine and lysine
residues. For 8-dose bioassay, the solubilization conditions were altered
slightly to increase the concentration of protein: 50 mM KOH was added
dropwise to 2.7 ml of a 12.77 mg/ml suspension of purified Cry3Bb* until
crystal solubilization occurred. The volume was then adjusted to 7 ml with
100 mM CAPS, pH 9.0.
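The quantities in this procedure scale linearly with the mass of protein, so the working amounts can be computed directly. The sketch below is illustrative only; the function names are hypothetical, and the final-concentration calculation assumes that all of the protein in the starting suspension is recovered in the adjusted 7 ml volume.

```python
KOH_ML_PER_MG = 10.0 / 100.0        # 10 ml of 50 mM KOH per 100 mg Cry3Bb
TOXIN_TO_TRYPSIN_WEIGHT_RATIO = 50  # 50 mg toxin to 1 mg trypsin


def koh_volume_ml(protein_mg: float) -> float:
    """Volume of 50 mM KOH used to solubilize the given mass of protein."""
    return protein_mg * KOH_ML_PER_MG


def trypsin_mass_mg(toxin_mg: float) -> float:
    """Mass of trypsin for the stated 50:1 toxin-to-trypsin weight ratio."""
    return toxin_mg / TOXIN_TO_TRYPSIN_WEIGHT_RATIO


def final_concentration_mg_per_ml(
    suspension_ml: float, suspension_mg_per_ml: float, final_volume_ml: float
) -> float:
    """Protein concentration after adjusting the volume (assumes full recovery)."""
    return suspension_ml * suspension_mg_per_ml / final_volume_ml


if __name__ == "__main__":
    print("KOH for 100 mg protein (ml):", koh_volume_ml(100))
    print("trypsin for 50 mg toxin (mg):", trypsin_mass_mg(50))
    print(
        "8-dose preparation (mg/ml):",
        round(final_concentration_mg_per_ml(2.7, 12.77, 7.0), 2),
    )
```

Under that assumption the 8-dose preparation contains about 34.5 mg of protein in 7 ml, or roughly 4.9 mg/ml, compared with at most 10 mg/ml before buffering in the standard procedure.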

Table 12
Plasmids Carrying Altered cry3Bb* Genes Transformed into B. thuringiensis
for Expression of Altered Cry3Bb* Proteins

Plasmid Designation   New BT Strain                   Expressed Protein
pEG1701               EG11204                         WT Cry3Bb
pEG1701               EG11037                         WT Cry3Bb
pEG1707               EG11221                         Cry3Bb.11221
pEG1708               EG11222                         Cry3Bb.11222
pEG1709               EG11223                         Cry3Bb.11223
pEG1710               EG11224                         Cry3Bb.11224
pEG1711               EG11225                         Cry3Bb.11225
pEG1712               EG11226                         Cry3Bb.11226
pEG1713               EG11227                         Cry3Bb.11227
pEG1714               EG11228                         Cry3Bb.11228
pEG1715               EG11229                         Cry3Bb.11229
pEG1716               EG11230                         Cry3Bb.11230
pEG1717               EG11231                         Cry3Bb.11231
pEG1718               EG11232                         Cry3Bb.11232
pEG1719               EG11233                         Cry3Bb.11233
pEG1720               EG11234                         Cry3Bb.11234
pEG1721               EG11235                         Cry3Bb.11235
pEG1722               EG11236                         Cry3Bb.11236
pEG1723               EG11237                         Cry3Bb.11237
pEG1724               EG11238                         Cry3Bb.11238
pEG1725               EG11239                         Cry3Bb.11239
pEG1726               EG11241                         Cry3Bb.11241
pEG1727               EG11242                         Cry3Bb.11242
pEG1041               EG11032                         Cry3Bb.11032
pEG1046               EG11035                         Cry3Bb.11035
pEG1047               EG11036                         Cry3Bb.11036
pEG1052               EG11046                         Cry3Bb.11046
pEG1054               EG11048                         Cry3Bb.11048
pEG1057               EG11051                         Cry3Bb.11051
pEG1062               EG11057                         Cry3Bb.11057
pEG1063               EG11058                         Cry3Bb.11058
pEG1084               EG11081                         Cry3Bb.11081
pEG1085               EG11082                         Cry3Bb.11082
pEG1086               EG11083                         Cry3Bb.11083
pEG1087               EG11084                         Cry3Bb.11084
pEG1095               EG11095                         Cry3Bb.11095
pEG1098               EG11098                         Cry3Bb.11098
pEG1701.lβ2,3         collection of unnamed strains   Cry3Bb.lβ2,3
pEG1701.lβ6,7         collection of unnamed strains   Cry3Bb.lβ6,7
pEG1701.lβ10,11       collection of unnamed strains   Cry3Bb.lβ10,11

5.28.2 Determination of Molecular Weight of Cry3Bb-60

The molecular weight of the predominant trypsin digestion fragment of Cry3Bb
was determined to be 60 kDa by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) analysis using commercial molecular weight markers. This digestion
fragment is designated Cry3Bb-60. No further digestion of the 60 kDa cleavage
product was observed.

5.28.3 Determination of NH2-Terminus of Cry3Bb-60

To determine the NH2-terminal sequence of Cry3Bb-60, the trypsin digest was
fractionated by SDS-PAGE and transferred to Immobilon™-P membrane (Millipore
Corporation, Bedford, MA) following standard western blotting procedures.
After transfer, the membrane was rinsed twice with water then stained with
0.025% Coomassie Brilliant Blue R-250 plus 40% methanol for 5 min, destained
with 50% methanol and rinsed in water. The Cry3Bb.60 band was excised with a
razor blade. NH2-terminal sequencing was performed at the Tufts Medical
School, Department of Physiology (Boston, MA) using standard automated Edman
degradation procedures. The NH2-terminal amino acid sequence was determined to
be SKRSQDR (SEQ ID NO:96), corresponding to amino acids 160-166 of Cry3Bb.
Trypsin digestion occurred on the carboxyl side of amino acid R159 resulting
in the removal of helices 1-3.
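The relationship between the sequenced NH2-terminus and the trypsin cleavage rule can be illustrated with a short sketch. It lists every residue after which trypsin could in principle cut (the carboxyl side of K or R), applying only the rule stated in this example; it does not model accessibility, which determines which candidate sites are actually cut. The eight-residue fragment used below is assembled from R159 and the reported SKRSQDR sequence, and the function name is hypothetical.

```python
from typing import List


def tryptic_cleavage_sites(sequence: str, first_residue: int = 1) -> List[int]:
    """Return residue numbers after which trypsin may cleave.

    A site is any K or R that is not the final residue of the sequence.
    Accessibility of the site in the folded protein is not considered.
    """
    sites = []
    for index, amino_acid in enumerate(sequence[:-1]):
        if amino_acid in "KR":
            sites.append(first_residue + index)
    return sites


if __name__ == "__main__":
    # R159 followed by the reported NH2-terminal sequence (residues 160-166).
    fragment = "R" + "SKRSQDR"
    print(tryptic_cleavage_sites(fragment, first_residue=159))
```

In this fragment the candidate sites are R159, K161 and R162. The observed NH2-terminus begins at S160, so only the R159 site was cut, which agrees with the statement that trypsin acts on available arginine and lysine residues and that no further digestion of the 60 kDa product was observed.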



5.29 Example 29 - Bioactivity of Cry3Bb* Proteins

5.29.1 Culture Conditions and Protein Concentration Determination

Cultures for 1-dose bioassays were grown in C2-P plus 5 µg/ml chloramphenicol
(C2-P/cm5) then diluted with 3 volumes of 0.005% Triton X-100®. The protein
concentrations of these cultures were not determined. Cultures for 8-dose
bioassays were grown in C2/cm5, washed 1 - 2 times with 1 - 2 volumes of
sterile water and resuspended in 1/10 volume of sterile 0.005% Triton X-100®.
The toxin protein concentration of each concentrate was determined as
described by Brussock and Currier (1990), omitting the treatment with 3 M
HEPES. The protein concentration was adjusted to 3.2 mg/ml in 0.005% Triton
X-100® for the top dose of the assay. Cry3Bb.60 was produced and quantified
for 8-dose assay as described in Section 9.1.

5.29.2 Insect Bioassays

Diabrotica undecimpunctata howardi Barber (southern corn rootworm or SCRW) and
Diabrotica virgifera virgifera LeConte (western corn rootworm or WCRW) larvae
were reared as described by Slaney et al. (1992). Eight-dose assays and probit
analyses were performed as described by Slaney et al. (1992). Thirty-two
larvae were tested per dose at 50 µl of sample per well of diet (surface area
of 175 mm²). Positive controls were WT Cry3Bb-producing strains EG11037 or
EG11204. All bioassays were performed using 128-well trays containing
approximately 1 ml of diet per well with perforated mylar sheet covers (C-D
International Inc., Pitman, NJ). One-dose assays were performed essentially
the same except only 1 dose was tested per strain. All assays were replicated
at least twice.

5.29.3 Insect Bioassay Results: 1-Dose Assays Against SCRW

Results from 1-dose assays are expressed as the relative mortality (RM) of the
experimental strain compared to WT (% mortality of experimental culture
divided by % mortality of WT culture). Altered and improved Cry3Bb proteins
derived from plasmids constructed using PCR™ methods introducing random or
semi-random changes into the cry3Bb gene sequence were distinguished from
other altered but not improved Cry3Bb proteins by replicated, 1-dose assay
against SCRW larvae. Those proteins showing increased activity (defined as
RM > 1.5) compared to WT Cry3Bb or, in the case of proteins with combinations
of altered sites, compared to a "parental" altered Cry3Bb protein were further
characterized by 8-dose assay. The overall RM "pattern" produced by 1-dose
assay results from a collection of proteins carrying random or semi-random
alterations within a single structural region, e.g., in lβ2,3, can be used to
determine if that structural region is important for bioactivity. Retention of
WT levels of activity (RM ≈ 1) indicates that changes are tolerated in that
region. Overall loss of activity (RM < 1) distinguishes the region as
important for bioactivity.
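The RM calculation and the RM > 1.5 screening criterion can be written out as a short sketch. The mortality percentages in the usage section are hypothetical values chosen only to show the arithmetic, and the function names are illustrative.

```python
RM_THRESHOLD = 1.5  # proteins with RM above this value go on to 8-dose assay


def relative_mortality(pct_mortality_experimental: float, pct_mortality_wt: float) -> float:
    """RM = % mortality of experimental culture / % mortality of WT culture."""
    if pct_mortality_wt <= 0:
        raise ValueError("WT control mortality must be greater than zero")
    return pct_mortality_experimental / pct_mortality_wt


def advances_to_eight_dose(rm: float, threshold: float = RM_THRESHOLD) -> bool:
    """Apply the 'increased activity' criterion, RM > threshold."""
    return rm > threshold


if __name__ == "__main__":
    # Hypothetical 1-dose results: 60% mortality (altered) vs 30% (WT control).
    rm = relative_mortality(60.0, 30.0)
    print("RM:", rm, "advance to 8-dose assay:", advances_to_eight_dose(rm))
```

Because the criterion is a strict inequality, a protein with RM of exactly 1.5 would not advance in this sketch.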

5.29.4 Cry3Bb.lβ2,3: Results of 1-Dose Bioassays Against SCRW

Cry3Bb.lβ2,3 proteins are a collection of proteins altered in the lβ2,3 region
of Cry3Bb (see Section 5.3.4). Typical results of 1-dose assays of these
altered proteins are shown in FIG. 12. The RM values for Cry3Bb.lβ2,3 proteins
are less than 1, with a few exceptions of values close to 1, indicating that
this region is important for toxicity.

5.29.5 Cry3Bb.lβ6,7: Results of 1-Dose Bioassays Against SCRW

Cry3Bb.lβ6,7 proteins are a collection of proteins altered in the lβ6,7 region
of Cry3Bb (see Section 5.3.5). Typical results of 1-dose assays of these
altered proteins are shown in FIG. 13. With a few exceptions of values close
to 1, the RM values for Cry3Bb.lβ6,7 proteins are less than 1, indicating that
this region is important for toxicity.

5.29.6 Cry3Bb.lβ10,11: Results of 1-Dose Bioassays Against SCRW

Cry3Bb.lβ10,11 proteins are a collection of proteins altered in the lβ10,11
region of Cry3Bb (see Section 5.3.6). Typical results of 1-dose assays of
these altered proteins are shown in FIG. 14. With a few exceptions of values
close to 1, the RM values for Cry3Bb.lβ10,11 proteins are less than 1,
indicating that this region is important for bioactivity.

5.29.7 Insect Bioassay Results: Results of 8-Dose Assays Against SCRW

Results from 8-dose assays are expressed as an LC50 value (protein
concentration giving 50% mortality) with 95% confidence intervals. The LC50
values with 95% confidence intervals of altered Cry3Bb proteins showing
improved activities against SCRW larvae and LC50 values of the WT Cry3Bb
control determined at the same time are listed in Table 13 along with the fold
increase over WT activity for each improved protein.
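The fold increase reported for each improved protein is consistent with the ratio of the WT control LC50 to the improved protein's LC50, rounded to one decimal place. The sketch below shows the calculation using the LC50 pairs reported for Cry3Bb.60 and Cry3Bb.11221; the function name is illustrative, and the confidence intervals are not used.

```python
def fold_increase(lc50_wt_control: float, lc50_improved: float) -> float:
    """Fold increase over WT activity: a lower LC50 means higher activity."""
    if lc50_improved <= 0:
        raise ValueError("LC50 must be greater than zero")
    return lc50_wt_control / lc50_improved


if __name__ == "__main__":
    # (protein, LC50 of improved protein, LC50 of WT control run at the same time)
    reported = [
        ("Cry3Bb.60", 6.7, 24.1),
        ("Cry3Bb.11221", 3.2, 20.5),
    ]
    for name, improved, control in reported:
        print(name, f"{round(fold_increase(control, improved), 1)}x")
```

Each improved protein is compared with the WT control assayed at the same time rather than with a single pooled WT value, which is why the control LC50 differs from row to row.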

Table 13
Designed Cry3Bb proteins were tested against SCRW larvae in replicated,
8-dose assays to determine the LC50 values

                   LC50 ng/well (95% C.I.)
Improved Protein   Improved Protein    WT Cry3Bb Control     Fold Increase Over WT Activity
Cry3Bb.60          6.7 (5.3-8.4)       24.1 (15-39)          3.6x
Cry3Bb.11221       3.2 (2.5-4)         20.5 (14.5-29)        6.4x
Cry3Bb.11222       7.3 (6-9)           29.4 (23-37)          4.0x
Cry3Bb.11223       10.5 (9-12)         29.4 (23-37)          2.8x
Cry3Bb.11224       6.5 (5.1-8.2)       32.5 (25-43)          5.0x
Cry3Bb.11225       13.7 (11-16.8)      49.5 (39-65)          3.6x
Cry3Bb.11226       16.7 (10.6-24.2)    49.5 (39-65)          [illegible]
Cry3Bb.11227       11.1 (9.1-13.5)     21.3 (16-28)          1.9x
Cry3Bb.11228       8.0 (6.6-9.8)       32.9 (25-45)          4.1x
Cry3Bb.11229       7.2 (5.8-8.8)       18.2 (15-22)          2.5x
Cry3Bb.11230       7.0 (5.8-8.6)       32.9 (25-45)          4.7x
Cry3Bb.11231       3.3 (3.0-3.7)       26.1 (22-31)          7.9x
Cry3Bb.11232       6.4 (5.4-7.7)       32.9 (25-45)          5.1x
Cry3Bb.11233       15.7 (12-20)        32.9 (25-45)          [illegible]
Cry3Bb.11234       7 (6-9)             29 (22-39)            4.1x
Cry3Bb.11235       4.2 (3.6-4.9)       13.3 (10-17)          3.2x
Cry3Bb.11236       11.6 (9-15)         36.4 (27-49)          3.1x
Cry3Bb.11237       6.8 (4-11)          36.4 (27-49)          5.4x
Cry3Bb.11238       13.9 (11-17)        36.4 (27-49)          2.6x
Cry3Bb.11239       13.0 (10-16)        36.4 (27-49)          2.8x
Cry3Bb.11241       11 (7-16)           29 (22-39)            2.6x
Cry3Bb.11242       11.9 (9.2-16)       30 [C.I. illegible]   2.5x
Cry3Bb.11032       4.2 (3.6-4.9)       [illegible]           3.1x
Cry3Bb.11035       10.3 (8-13)         27.9 [C.I. illegible] 2.7x
Cry3Bb.11036       6.5 (5.1-7.9)       [illegible]           4.3x
Cry3Bb.11046       12.1 [C.I. illegible] [illegible]         2.6x
Cry3Bb.11048       8.3 (6-11)          35.4 [C.I. illegible] 4.3x
Cry3Bb.11051       11.8 (8-16)         35.4 (24-53)          3.0x
Cry3Bb.11057       8.8 (7-11)          29.5 (24-36)          3.4x
Cry3Bb.11058       9.6 (6-14)          33.4 (27-43)          3.5x
Cry3Bb.11081       8.5 (7-11)          51.5 (37-79)          6.1x
Cry3Bb.11082       10.6 (8-13)         51.5 (37-79)          4.9x
Cry3Bb.11083       7.0 (5-10)          51.5 (37-79)          7.4x
Cry3Bb.11084       7.2 (4-12)          51.5 (37-79)          7.2x
Cry3Bb.11095       11.1 (9-14)         51.5 (37-79)          4.6x


Cry3Bb. 11098 









5.29.8 Insect Bioassay Results: 8-Dose Assays Against WCRW 

WCRW larvae are delicate and difficult to work with. Therefore, only some of the
designed Cry3Bb proteins showing improved activity against SCRW larvae were also
tested against WCRW larvae in 8-dose assays. The LC50 determinations for the
designed Cry3Bb proteins are shown in Table 14 along with the LC50 values of the WT
Cry3Bb control determined at the same time.

Table 14

Cry3Bb* Proteins Showing Improved Activity Against SCRW Larvae
Also Show Improved Activity Against WCRW Larvae

                   LC50 ng/well (95% C.I.)
Improved Protein   Improved Protein     WT Cry3Bb Control   Fold Increase Over WT Activity

EG11083            6.3 (4.7-8.2)        63.5 (46-91)        10.1x
EG11230            4.5 (2.1-7.4)        24.2 (13-40)        5.4x
EG11231            2.5 (1.7-3.6)        32.2 (14-67)        12.9x

5.30 Example 30 - Channel Activity

Ion channels produced by Cry3Bb and some of its derivatives were measured by the
methods described by Slatin et al. (1990). In some instances, lipid bilayers were
prepared from a 4:1 mixture of phosphatidylethanolamine (PE) and phosphatidylcholine
(PC). Toxin protein was solubilized with 12 mM KOH from washed B. thuringiensis
cultures grown in C2 medium. Following centrifugation to remove spores and other
debris, 10 µg of soluble toxin protein was added to the cis compartment (4.5 ml
volume) of the membrane chamber. Protein concentration was determined using the BCA
Protein Assay (Pierce).

5.30.1 Channel activity of WT Cry3Bb 

Upon exposure to black lipid membranes, Cry3Bb forms ion channels with various
conductance states. The channels formed by Cry3Bb are rarely discrete channels with
well-resolved open and closed states and usually require incubation of the toxin
with the membrane for 30 - 45 min before any channel-like events are observed.
After formation of the initial conductances, the size increases from approximately
200 pS to over 10,000 pS over 2 - 3 h. Only the small conductances (≤ 200 pS) are
voltage dependent. Over 200 pS, the conductances are completely symmetric. Cry3Bb
channels also exhibit β-mercaptoethanol-dependent activation, growing from small
channel conductances of ~200 pS to several thousand pS within 2 min of the addition
of β-mercaptoethanol to the cis compartment of the membrane chamber.

5.30.2 Cry3Bb.11032

The channel activity of Cry3Bb.11032 is much like that of WT Cry3Bb when the
solubilized toxin protein is added to the cis compartment of the membrane chamber.
However, when this protein is artificially incorporated into the membrane by forming
or "painting" the membrane in the presence of the Cry3Bb.11032 protein, a 16-fold
increase in the initial channel conductances is observed (~4000 pS). This phenomenon
is not observed with WT Cry3Bb.

5.30.3 Cry3Bb.11035

Upon exposure to artificial membranes, the Cry3Bb.11035 protein spontaneously forms
channels that grow to large conductances within a relatively short time span
(~5 min). Conductance values range from 3000 - 6000 pS and, like those of WT Cry3Bb,
are voltage dependent at low conductance values.

5.30.4 Cry3Bb.11048

The Cry3Bb.11048 protein is quite different from WT Cry3Bb in that it appears not to
form channels at all, but, rather, forms symmetrical pores with respect to voltage.
Once the pore is formed, it remains open and allows a steady conductance ranging
from 25 to 130 pS.

5.30.5 Cry3Bb.11224 and Cry3Bb.11226

The metal binding site of WT Cry3Bb formed by H231 in the dimer structure was
removed in proteins Cry3Bb.11224 and Cry3Bb.11226. The conductances formed by both
designed proteins are identical to those of WT Cry3Bb, with the exception that
neither of the designed proteins exhibits β-mercaptoethanol-dependent activation.

5.30.6 Cry3Bb.11221

Cry3Bb.11221 protein has been observed to immediately form small channels of
100 - 200 pS with limited voltage dependence. Some higher conductances were observed
at the negative potential. In other studies, the onset of activity was delayed by
27 min, which is more typical for WT Cry3Bb. Unlike WT Cry3Bb, however,
Cry3Bb.11221 forms well-resolved, 600 pS channels with long open states. The protein
eventually reaches conductances of 7000 pS.

5.30.7 Cry3Bb.11242

Cry3Bb.11242 protein forms small conductances immediately upon exposure to an
artificial membrane. The conductances grow steadily and rapidly to 6000 pS in
approximately 3 min. Some voltage dependence was noted, with a preference for a
negative imposed voltage.

5.30.8 Cry3Bb.11230

Unlike WT Cry3Bb, Cry3Bb.11230 forms well-resolved channels with long open states
that do not continue to grow in conductance with time. The maximum observed channel
conductances reached 3000 pS. FIG. 15 illustrates the difference between the
channels formed by Cry3Bb and Cry3Bb.11230.

5.30.9 Cry3Bb.60

Cry3Bb.60 forms well-resolved ion channels within 20 min of exposure to an
artificial membrane. These channels grow in conductance and frequency with time.
The behavior of Cry3Bb.60 in a planar lipid bilayer differs from that of Cry3Bb in
two significant ways. The conductances created by Cry3Bb.60 form more quickly than
those of Cry3Bb and, unlike those of Cry3Bb, are stable, having well-resolved open
and closed states definitive of stable ion channels (FIG. 16).

5.31 Example 31 - Primer Compositions

Table 15

SEQ ID NO:83

        % of Nucleotide in mixture
Code    A     T     G     C
N       25    25    25    25

Table 16

SEQ ID NO:84

        % of Nucleotide in mixture
Code    A     T     G     C
N       25    25    25    25

Table 17

SEQ ID NO:85

        % of Nucleotide in mixture
Code    A     T     G     C
B       16    16    52    16
D       70    10    10    10
N       25    25    25    25

Table 18

SEQ ID NO:86

        % of Nucleotide in mixture
Code    A     T     G     C
E       82    6     6     6
F       6     6     6     82
J       6     82    6     6
I       6     6     82    6
N       25    25    25    25

Table 19

SEQ ID NO:88

        % of Nucleotide in mixture
Code    A     T     G     C
J       6     82    6     6
E       82    6     6     6
H       1     1     1     97
I       6     6     82    6
K       15    15    15    55
R       15    55    15    15

Table 20

SEQ ID NO:[illegible]

        % of Nucleotide in mixture
Code    A     T     G     C
J       6     82    6     6
F       6     6     6     82
N       25    25    25    25
E       82    6     6     6
I       6     6     82    6
L       8     1     83    8
O       1     1     1     97

Table 21

SEQ ID NO:91

        % of Nucleotide in mixture
Code    A     T     G     C
J       6     82    6     6
E       82    6     6     6
H       1     1     1     97
I       6     6     82    6
N       25    25    25    25
M       82    2     8     8

Table 22

SEQ ID NO:92

        % of Nucleotide in mixture
Code          A     T     G     C
J             6     82    6     6
[illegible]   0     9     82    9
F             6     6     6     82
N             25    25    25    25
E             82    6     6     6
I             6     6     82    6

Table 23

SEQ ID NO:92

        % of Nucleotide in mixture
Code    A     T     G     C
J       6     82    6     6
F       6     6     6     82
N       25    25    25    25
E       82    6     6     6
I       6     6     82    6

Table 24

SEQ ID NO:95

        % of Nucleotide in mixture
Code    A     T     G     C
J       6     82    6     6
N       25    25    25    25
E       82    6     6     6
I       6     6     82    6
M       82    2     8     8
P       8     2     8     82
S       1     97    1     1

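Each code in Tables 15 through 24 names a nucleotide mixture used at one position of a mutagenic primer. A minimal sketch of how one concrete primer could be drawn from such a mixture follows; it is an illustration, not the synthesis procedure of the specification. The percentages are copied from Tables 17 and 18. The template string and the function name are hypothetical, and the sketch assumes that A, T, G and C in a template denote fixed bases.

```python
import random

BASES = "ATGC"  # column order used in Tables 15 through 24

# Percent of A, T, G, C for each code, copied from Tables 17 and 18.
CODE_MIX = {
    "N": (25, 25, 25, 25),
    "B": (16, 16, 52, 16),
    "D": (70, 10, 10, 10),
    "E": (82, 6, 6, 6),
    "F": (6, 6, 6, 82),
    "J": (6, 82, 6, 6),
    "I": (6, 6, 82, 6),
}


def sample_oligo(template: str, rng: random.Random) -> str:
    """Draw one concrete sequence; symbols without a mixture pass through unchanged."""
    out = []
    for symbol in template:
        mix = CODE_MIX.get(symbol)
        out.append(rng.choices(BASES, weights=mix, k=1)[0] if mix else symbol)
    return "".join(out)


# Hypothetical template: three fixed bases followed by five doped positions.
rng = random.Random(0)
for _ in range(3):
    print(sample_oligo("GCAEFJIN", rng))
```

With an 82/6/6/6 mixture, most sampled primers keep the dominant base at a doped position and a minority carry one of the three substitutions, which is the purpose of biasing the mixture.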

5.32 Example 32 - Atomic Coordinates for Cry3Bb

The atomic coordinates of the Cry3Bb protein are given in the Appendix included in
Section 9.1.

5.33 Example 33 - Atomic Coordinates for Cry3A

The atomic coordinates of the Cry3A protein are given in the Appendix included in
Section 9.2.

5.34 Example 34 - Modification of Cry Genes for Expression in Plants

Wild-type cry genes are known to be expressed poorly in plants, whether as a
full-length gene or as a truncated gene. Typically, the G+C content of a cry gene is
low (37%), and the gene often contains many A+T-rich regions, potential
polyadenylation sites, and numerous ATTTA sequences. Table 25 lists potential
polyadenylation sequences that should be avoided when preparing the "plantized" gene
construct.



WO 99/31248 



PCT/US98/26852 




Table 25

List Of Sequences Of The Potential Polyadenylation Signals

AATAAA*       AAGCAT
AATAAT*       ATTAAT
AACCAA        ATACAT
ATATAA        AAAATA
AATCAA        ATTAAA**
ATACTA        AATTAA**
ATAAAA        AATACA**
ATGAAA        CATAAA**

* indicates a potential major plant polyadenylation site.
** indicates a potential minor animal polyadenylation site.
All others are potential minor plant polyadenylation sites.

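A scan for the Table 25 signals and for ATTTA motifs can be sketched as follows. The classification labels follow the footnotes of Table 25. The function name and the example sequence are hypothetical and serve only to illustrate the search.

```python
# Signals and classes from Table 25 (* major plant, ** minor animal, others minor plant).
POLYA_SIGNALS = {
    "AATAAA": "major plant", "AATAAT": "major plant",
    "ATTAAA": "minor animal", "AATTAA": "minor animal",
    "AATACA": "minor animal", "CATAAA": "minor animal",
    "AAGCAT": "minor plant", "ATTAAT": "minor plant",
    "AACCAA": "minor plant", "ATACAT": "minor plant",
    "ATATAA": "minor plant", "AAAATA": "minor plant",
    "AATCAA": "minor plant", "ATACTA": "minor plant",
    "ATAAAA": "minor plant", "ATGAAA": "minor plant",
}


def find_motifs(seq: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, motif, class) for every hit, overlaps included."""
    seq = seq.upper()
    motifs = dict(POLYA_SIGNALS)
    motifs["ATTTA"] = "ATTTA sequence"
    hits = []
    for motif, label in motifs.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif, label))
            start = seq.find(motif, start + 1)  # step by one to catch overlaps
    return sorted(hits)


# Hypothetical fragment containing one major plant signal and one ATTTA motif.
for hit in find_motifs("GGCAATAAAGGATTTACC"):
    print(hit)
```

Only the coding strand is scanned here. Whether the complementary strand also needs checking is not addressed in the text.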
The regions for mutagenesis may be selected in the following manner. All regions of
the DNA sequence of the cry gene that contain five or more consecutive base pairs of
A or T are identified. These are ranked by length and by the highest percentage of
A+T in the surrounding sequence over a 20-30 base pair region. The DNA is analysed
for regions which might contain polyadenylation sites or ATTTA sequences.
Oligonucleotides are then designed which maximize the elimination of A+T consecutive
regions which contain one or more polyadenylation sites or ATTTA sequences. Two
potential plant polyadenylation sites have been shown to be more critical based on
published reports. Codons are selected which increase G+C content, but do not
generate restriction sites for enzymes useful for cloning and assembly of the
modified gene (e.g., BamHI, SacI, NcoI, EcoRV, etc.). Likewise, codons are avoided
which contain the doublets TA or GC, which have been reported to be infrequently
found in codons in plants.

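The first two selection steps, finding runs of five or more consecutive A or T bases and ranking them, can be sketched as below. The 25-base context window is an assumed value inside the 20-30 base pair range given in the text. The function name and the test sequence are hypothetical.

```python
import re


def at_rich_runs(seq: str, min_run: int = 5, window: int = 25) -> list[dict]:
    """Find runs of >= min_run consecutive A/T bases and rank them.

    Ranking is by run length, then by percent A+T in a window centred on
    the run (window size is an assumption within the 20-30 bp range).
    """
    seq = seq.upper()
    runs = []
    for match in re.finditer(r"[AT]{%d,}" % min_run, seq):
        centre = (match.start() + match.end()) // 2
        lo = max(0, centre - window // 2)
        hi = min(len(seq), lo + window)
        context = seq[lo:hi]
        at_pct = 100.0 * sum(base in "AT" for base in context) / len(context)
        runs.append({
            "start": match.start(),
            "length": match.end() - match.start(),
            "context_at_pct": round(at_pct, 1),
        })
    return sorted(runs, key=lambda r: (r["length"], r["context_at_pct"]), reverse=True)


# Hypothetical sequence with an 8-base and a 5-base A/T run.
for run in at_rich_runs("GCGCATATATATGCGCGCAATTAGCGC"):
    print(run)
```

The highest-ranked runs would then be checked for polyadenylation signals or ATTTA motifs before oligonucleotides are designed against them.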
Although the CaMV35S promoter is generally a high level constitutive promoter in
most plant tissues, the expression level of genes driven by the CaMV35S promoter is
low in floral tissue relative to the levels seen in leaf tissue. Because the
economically important targets damaged by some insects are the floral parts or
derived from floral parts (e.g., cotton squares and bolls, tobacco buds, tomato buds
and fruit), it is often advantageous to increase the expression of crystal proteins
in these tissues over that obtained with the CaMV35S promoter.

The 35S promoter of Figwort Mosaic Virus (FMV) is analogous to the CaMV35S promoter.
This promoter has been isolated and engineered into a plant transformation vector.
Relative to the CaMV promoter, the FMV 35S promoter is highly expressed in the
floral tissue, while still providing similar high levels of gene expression in other
tissues such as leaf. A plant transformation vector may be constructed in which the
full length synthetic cry gene is driven by the FMV 35S promoter. Tobacco plants may
be transformed with the vector and compared for expression of the crystal protein by
Western blot or ELISA immunoassay in leaf and floral tissue. The FMV promoter has
been used to produce relatively high levels of crystal protein in floral tissue
compared to the CaMV promoter.

5.35 Example 35 - Expression of Synthetic cry Genes with ssRUBISCO
Promoters and Chloroplast Transit Peptides

The genes in plants encoding the small subunit of RUBISCO (SSU) are often highly
expressed, light regulated and sometimes show tissue specificity. These expression
properties are largely due to the promoter sequences of these genes. It has been
possible to use SSU promoters to express heterologous genes in transformed plants.
Typically a plant will contain multiple SSU genes, and the expression levels and
tissue specificity of different SSU genes will be different. The SSU proteins are
encoded in the nucleus and synthesized in the cytoplasm as precursors that contain
an N-terminal extension known as the chloroplast transit peptide (CTP). The CTP
directs the precursor to the chloroplast and promotes the uptake of the SSU protein
into the chloroplast. In this process, the CTP is cleaved from the SSU protein.
These CTP sequences have been used to direct heterologous proteins into chloroplasts
of transformed plants.

The SSU promoters might have several advantages for expression of heterologous genes
in plants. Some SSU promoters are very highly expressed and could give rise to
expression levels as high as or higher than those observed with the CaMV35S
promoter. The tissue distribution of expression from SSU promoters is different from
that of the CaMV35S promoter, so for control of some insect pests, it may be
advantageous to direct the expression of crystal proteins to those cells in which
SSU is most highly expressed. For example, although relatively constitutive, in the
leaf the CaMV35S promoter is more highly expressed in vascular tissue than in some
other parts of the leaf, while most SSU promoters are most highly expressed in the
mesophyll cells of the leaf. Some SSU promoters also are more highly tissue
specific, so it could be possible to utilize a specific SSU promoter to express the
protein of the present invention in only a subset of plant tissues, if for example
expression of such a protein in certain cells was found to be deleterious to those
cells. For example, for control of Colorado potato beetle in potato, it may be
advantageous to use SSU promoters to direct crystal protein expression to the leaves
but not to the edible tubers.

Utilizing SSU CTP sequences to localize crystal proteins to the chloroplast might
also be advantageous. Localization of the B. thuringiensis crystal proteins to the
chloroplast could protect these from proteases found in the cytoplasm. This could
stabilize the proteins and lead to higher levels of accumulation of active toxin.
cry genes containing the CTP could be used in combination with the SSU promoter or
with other promoters such as CaMV35S.

5.36 Example 36 - Targeting of Cry* Proteins to the Extracellular
Space or Vacuole through the Use of Signal Peptides

The B. thuringiensis proteins produced from the synthetic genes described here are
localized to the cytoplasm of the plant cell, and this cytoplasmic localization
results in plants that are insecticidally effective. It may be advantageous for some
purposes to direct the B. thuringiensis proteins to other compartments of the plant
cell. Localizing B. thuringiensis proteins in compartments other than the cytoplasm
may result in less exposure of the B. thuringiensis proteins to cytoplasmic
proteases, leading to greater accumulation of the protein and thus to enhanced
insecticidal activity. Extracellular localization could lead to more efficient
exposure of certain insects to the B. thuringiensis proteins, leading to greater
efficacy. If a B. thuringiensis protein were found to be deleterious to plant cell
function, then localization to a noncytoplasmic compartment could protect these
cells from the protein.

In plants as well as other eukaryotes, proteins that are destined to be localized
either extracellularly or in several specific compartments are typically synthesized
with an N-terminal amino acid extension known as the signal peptide. This signal
peptide directs the protein to enter the compartmentalization pathway, and it is
typically cleaved from the mature protein as an early step in compartmentalization.
For an extracellular protein, the secretory pathway typically involves
cotranslational insertion into the endoplasmic reticulum with cleavage of the signal
peptide occurring at this stage. The mature protein then passes through the Golgi
body into vesicles that fuse with the plasma membrane, thus releasing the protein
into the extracellular space. Proteins destined for other compartments follow a
similar pathway. For example, proteins that are destined for the endoplasmic
reticulum or the Golgi body follow this scheme, but they are specifically retained
in the appropriate compartment. In plants, some proteins are also targeted to the
vacuole, another membrane bound compartment in the cytoplasm of many plant cells.
Vacuole targeted proteins diverge from the above pathway at the Golgi body, where
they enter vesicles that fuse with the vacuole.

A common feature of this protein targeting is the signal peptide that initiates the
compartmentalization process. Fusing a signal peptide to a protein will in many
cases lead to the targeting of that protein to the endoplasmic reticulum. The
efficiency of this step may depend on the sequence of the mature protein itself as
well. The signals that direct a protein to a specific compartment rather than to the
extracellular space are not as clearly defined. It appears that many of the signals
that direct the protein to specific compartments are contained within the amino acid
sequence of the mature protein. This has been shown for some vacuole targeted
proteins, but it is not yet possible to define these sequences precisely. It appears
that secretion into the extracellular space is the "default" pathway for a protein
that contains a signal sequence but no other compartmentalization signals. Thus, a
strategy to direct B. thuringiensis proteins out of the cytoplasm is to fuse the
synthetic B. thuringiensis genes to DNA sequences encoding known plant signal
peptides. These fusion genes will give rise to B. thuringiensis proteins that enter
the secretory pathway, leading to extracellular secretion or targeting to the
vacuole or other compartments.

Signal sequences for several plant genes have been described. One such sequence, for
the tobacco pathogenesis related protein PR1b, has been previously described
(Cornelissen et al., 1986). The PR1b protein is normally localized to the
extracellular space. Another type of signal peptide is contained on seed storage
proteins of legumes. These proteins are localized to the protein body of seeds,
which is a vacuole-like compartment found in seeds. A signal peptide DNA sequence
for the β-subunit of the 7S storage protein of common bean (Phaseolus vulgaris),
PvuB, has been described (Doyle et al., 1986). Based on these published sequences,
genes may be synthesized chemically using oligonucleotides that encode the signal
peptides for PR1b and PvuB. In some cases, to achieve secretion or
compartmentalization of heterologous proteins, it may be necessary to include some
amino acid sequence beyond the normal cleavage site of the signal peptide. This may
be necessary to ensure proper cleavage of the signal peptide.



5.37 Example 37 - Isolation of Transgenic Maize Resistant to
Diabrotica spp. Using Cry3Bb Variants

5.37.1 Plant Gene Construction

The expression of a plant gene which exists in double-stranded DNA form involves
transcription of messenger RNA (mRNA) from one strand of the DNA by RNA polymerase
enzyme, and the subsequent processing of the mRNA primary transcript inside the
nucleus. This processing involves a 3' non-translated region which causes the
addition of polyadenylate nucleotides to the 3' end of the RNA. Transcription of DNA
into mRNA is regulated by a region of DNA usually referred to as the "promoter". The
promoter region contains a sequence of bases that signals RNA polymerase to
associate with the DNA and to initiate the transcription of mRNA using one of the
DNA strands as a template to make a corresponding strand of RNA.

A number of promoters which are active in plant cells have been described in the
literature. Such promoters may be obtained from plants or plant viruses and include,
but are not limited to, the nopaline synthase (NOS) and octopine synthase (OCS)
promoters (which are carried on tumor-inducing plasmids of Agrobacterium
tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 35S promoters, the
light-inducible promoter from the small subunit of ribulose 1,5-bisphosphate
carboxylase (ssRUBISCO, a very abundant plant polypeptide), and the Figwort Mosaic
Virus (FMV) 35S promoter. All of these promoters have been used to create various
types of DNA constructs which have been expressed in plants (see, e.g., U. S. Patent
No. 5,463,175, specifically incorporated herein by reference).

The particular promoter selected should be capable of causing sufficient expression
of the enzyme coding sequence to result in the production of an effective amount of
protein. One set of preferred promoters are constitutive promoters such as the
CaMV35S or FMV35S promoters that yield high levels of expression in most plant
organs (U. S. Patent No. 5,378,619, specifically incorporated herein by reference).
Another set of preferred promoters are root enhanced or specific promoters such as
the CaMV derived 4 as-1 promoter or the wheat POX1 promoter (U. S. Patent No.
5,023,179, specifically incorporated herein by reference; Hertig et al., 1991). The
root enhanced or specific promoters would be particularly preferred for the control
of corn rootworm (Diabrotica spp.) in transgenic corn plants.

The promoters used in the DNA constructs (i.e., chimeric plant genes) of the present
invention may be modified, if desired, to affect their control characteristics. For
example, the CaMV35S promoter may be ligated to the portion of the ssRUBISCO gene
that represses the expression of ssRUBISCO in the absence of light, to create a
promoter which is active in leaves but not in roots. The resulting chimeric promoter
may be used as described herein. For purposes of this description, the phrase
"CaMV35S" promoter thus includes variations of CaMV35S promoter, e.g., promoters
derived by means of ligation with operator regions, random or controlled
mutagenesis, etc. Furthermore, the promoters may be altered to contain multiple
"enhancer sequences" to assist in elevating gene expression.

The RNA produced by a DNA construct of the present invention also contains a 5'
non-translated leader sequence. This sequence can be derived from the promoter
selected to express the gene, and can be specifically modified so as to increase
translation of the mRNA. The 5' non-translated regions can also be obtained from
viral RNAs, from suitable eucaryotic genes, or from a synthetic gene sequence. The
present invention is not limited to constructs wherein the non-translated region is
derived from the 5' non-translated sequence that accompanies the promoter sequence.

For optimized expression in monocotyledonous plants such as maize, an intron should
also be included in the DNA expression construct. This intron would typically be
placed near the 5' end of the mRNA in untranslated sequence. This intron could be
obtained from, but is not limited to, a set of introns consisting of the maize hsp70
intron (U. S. Patent No. 5,424,412; specifically incorporated herein by reference)
or the rice Act1 intron (McElroy et al., 1990). As shown below, the maize hsp70
intron is useful in the present invention.

As noted above, the 3' non-translated region of the chimeric plant genes of the
present invention contains a polyadenylation signal which functions in plants to
cause the addition of adenylate nucleotides to the 3' end of the RNA. Examples of
preferred 3' regions are (1) the 3' transcribed, non-translated regions containing
the polyadenylate signal of Agrobacterium tumor-inducing (Ti) plasmid genes, such as
the nopaline synthase (NOS) gene and (2) plant genes such as the pea ssRUBISCO E9
gene (Fischhoff et al., 1987).

5.37.2 Plant Transformation and Expression

A chimeric plant gene containing a structural coding sequence of the present
invention can be inserted into the genome of a plant by any suitable method.
Suitable plant transformation vectors include those derived from a Ti plasmid of
Agrobacterium tumefaciens, as well as those disclosed, e.g., by Herrera-Estrella
(1983), Bevan (1983), Klee (1985) and Eur. Pat. Appl. Publ. No. EP0120516. In
addition to plant transformation vectors derived from the Ti or root-inducing (Ri)
plasmids of Agrobacterium, alternative methods can be used to insert the DNA
constructs of this invention into plant cells. Such methods may involve, for
example, the use of liposomes, electroporation, chemicals that increase free DNA
uptake, free DNA delivery via microprojectile bombardment, and transformation using
viruses or pollen (Fromm et al., 1986; Armstrong et al., 1990; Fromm et al., 1990).

5.37.3 Construction of Monocot Plant Expression Vectors for cry3Bb
Variants

5.37.3.1 Design of cry3Bb Variant Genes for Plant Expression

For efficient expression of the cry3Bb variants in transgenic plants, the gene
encoding the variants must have a suitable sequence composition (Diehn et al.,
1996). One example of such a sequence is shown for the v11231 gene (SEQ ID NO:99),
which encodes the Cry3Bb.11231 variant protein (SEQ ID NO:100) with Diabrotica
activity. This gene was derived via mutagenesis (Kunkel, 1985) of a cry3Bb synthetic
gene (SEQ ID NO:101) encoding a protein essentially homologous to the protein
encoded by the native cry3Bb gene (GenBank Accession Number M89794, SEQ ID NO:102).
The following oligonucleotides were used in the mutagenesis of the original cry3Bb
synthetic gene (SEQ ID NO:101) to create the v11231 gene (SEQ ID NO:99):

Oligo #1:

5'-TAGGCCTCCATCCATGGCAAACCCTAACAATC-3' (SEQ ID NO: 103)
Oligo #2:

5'-TCCCATCTTCCTACTTACGACCCTGCAGAAATACGGTCCAAC-3' (SEQ ID NO: 104)
Oligo #3:

5'-GACCTCACCTACCAAACATTCGATCTTG-3' (SEQ ID NO: 105)
Oligo #4:

5'-CGAGTTCTACCGTAGGCAGCTCAAG-3' (SEQ ID NO: 106)

5.37.3.2 Construction of Cry3Bb Monocot Plant Expression Vector

To place the cry3Bb variant gene v11231 in a vector suitable for expression in
monocotyledonous plants (i.e., under control of the enhanced Cauliflower Mosaic
Virus 35S promoter and linked to the hsp70 intron, followed by a nopaline synthase
polyadenylation site as in U. S. Patent No. 5,424,412, specifically incorporated
herein by reference), the vector pMON19469 was digested with NcoI and EcoRI. The
larger vector band of approximately 4.6 kb was electrophoresed, purified, and
ligated with T4 DNA ligase to the NcoI-EcoRI fragment of approximately 2 kb
containing the v11231 gene (SEQ ID NO:99). The ligation mix was transformed into
E. coli, carbenicillin-resistant colonies were recovered, and plasmid DNA was
recovered by DNA miniprep procedures. This DNA was subjected to restriction
endonuclease analysis with enzymes such as NcoI and EcoRI (together), NotI, and PstI
to identify clones containing pMON33708 (the v11231 coding sequence fused to the
hsp70 intron under control of the enhanced CaMV35S promoter).

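The mutagenic oligonucleotides of Section 5.37.3.1 can be checked for recognition sites of the enzymes named in the cloning steps. The sketch below is an illustration only. The oligonucleotide sequences are those of SEQ ID NO:103 to 106 as printed, the recognition sequences are the standard ones for NcoI, EcoRI, NotI and PstI, and the function names are hypothetical.

```python
# Oligonucleotide sequences as printed for SEQ ID NO:103-106.
OLIGOS = {
    "Oligo #1": "TAGGCCTCCATCCATGGCAAACCCTAACAATC",
    "Oligo #2": "TCCCATCTTCCTACTTACGACCCTGCAGAAATACGGTCCAAC",
    "Oligo #3": "GACCTCACCTACCAAACATTCGATCTTG",
    "Oligo #4": "CGAGTTCTACCGTAGGCAGCTCAAG",
}

# Standard recognition sequences of the enzymes used in the restriction analysis.
SITES = {"NcoI": "CCATGG", "EcoRI": "GAATTC", "NotI": "GCGGCCGC", "PstI": "CTGCAG"}


def gc_percent(seq: str) -> float:
    """Percent G+C of a sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_sites(seq: str) -> dict[str, list[int]]:
    """Map enzyme name to 0-based start positions of its site; enzymes with no hit are omitted."""
    found = {}
    for enzyme, site in SITES.items():
        positions = [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]
        if positions:
            found[enzyme] = positions
    return found


for name, seq in OLIGOS.items():
    print(name, len(seq), "nt", f"{gc_percent(seq):.0f}% G+C", find_sites(seq))
```

Run on these sequences, the scan finds an NcoI site in Oligo #1 and a PstI site in Oligo #2, both enzymes that appear in the restriction analysis described above.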
To place the v11231 gene in a vector suitable for recovery of stably transformed and
insect resistant plants, the 3.75-kb NotI restriction fragment from pMON33708
containing the v11231 coding sequence fused to the hsp70 intron under control of the
enhanced CaMV35S promoter was isolated by gel electrophoresis and purification. This
fragment was ligated with pMON30460 treated with NotI and calf intestinal alkaline
phosphatase (pMON30460 contains the neomycin phosphotransferase coding sequence
under control of the CaMV35S promoter). Kanamycin resistant colonies were obtained
by transformation of this ligation mix into E. coli, and colonies containing
pMON33710 were identified by restriction endonuclease digestion of plasmid miniprep
DNAs. Restriction enzymes such as NotI, EcoRV, HindIII, NcoI, EcoRI, and BglII can
be used to identify the appropriate clones containing the NotI fragment of pMON33708
in the NotI site of pMON30460 (i.e., pMON33710) in the orientation such that both
genes are in tandem (i.e., the 3' end of the v11231 expression cassette is linked to
the 5' end of the nptII expression cassette). Expression of the v11231 protein by
pMON33710 in corn protoplasts was confirmed by electroporation of pMON33710 DNA into
protoplasts followed by protein blot and ELISA analysis. This vector can be
introduced into the genomic DNA of corn embryos by particle gun bombardment followed
by paromomycin selection to obtain corn plants expressing the v11231 gene
essentially as described in U. S. Patent No. 5,424,412, specifically incorporated
herein by reference.

In this example, the vector was introduced via cobombardment with a hygromycin
resistance conferring plasmid into immature embryo scutella (IES) of maize, followed
by hygromycin selection and regeneration. Transgenic corn lines expressing the
v11231 protein were identified by ELISA analysis. Progeny seed from these events
were subsequently tested for protection from Diabrotica feeding.

5.37.3.3 In planta Performance of Cry3Bb.11231

Transformed corn plants expressing Cry3Bb.11231 protein were challenged with western
corn rootworm (WCR) larvae in both a seedling and 10 inch pot assay. The transformed
genotype was A634, where the progeny of the R0 cross by A634 was evaluated.
Observations included effect on larval development (weight), root damage rating
(RDR), and protein expression. The transformation vector containing the cry3Bb gene
was pMON33710. Treatments included the positive and negative iso-populations for
each event and an A634 check.

The seedling assay consisted of the following steps: (i) single seeds were
placed in 1 oz cups containing potting soil; (ii) at spiking, each seedling was infested
with 4 neonate larvae; and (iii) after infestation, seedlings were incubated for 7 days at
25°C, 50% RH, and 14:10 (L:D) photoperiod. Adequate moisture was added to the
potting soil during the incubation period to maintain seedling vigor.

The 10 inch pot assay consisted of the following steps: (i) single seeds were
placed in 10 inch pots containing potting soil; (ii) at 14 days post planting, each pot
was infested with 800 eggs that had been pre-incubated such that hatch would
occur 5-7 days post infestation; and (iii) after infestation, plants were incubated for 4
weeks under the same environmental conditions as the seedling assay. Pots were both
sub and top irrigated daily.

For the seedling assay, on day 7 plants were given a root damage rating, and
surviving larvae were weighed. Also at this time, Cry3Bb protein concentrations in
the roots were determined by ELISA. The scale used for the seedling assay to assess
root damage is as follows: RDR (root damage rating) 0 = no visible feeding; RDR 1 =
very light feeding; RDR 2 = light feeding; RDR 3 = moderate feeding; RDR 4 =
heavy feeding; and RDR 5 = very heavy feeding.
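
The six-point seedling scale above can be written as a simple lookup. The following is a minimal illustrative sketch only, not part of the disclosed methods: the names SEEDLING_RDR_SCALE and describe_rdr are hypothetical, and rounding a mean rating to the nearest category is an assumption made here for illustration.

```python
# Seedling-assay root damage rating (RDR) scale, as defined in the text.
SEEDLING_RDR_SCALE = {
    0: "no visible feeding",
    1: "very light feeding",
    2: "light feeding",
    3: "moderate feeding",
    4: "heavy feeding",
    5: "very heavy feeding",
}


def describe_rdr(rating):
    """Return the scale category nearest to a (possibly fractional) rating.

    Assumption: a mean rating is mapped to the nearest whole category.
    Python's round() sends exact halves to the even neighbour (2.5 -> 2).
    """
    if not 0 <= rating <= 5:
        raise ValueError("seedling RDR must lie between 0 and 5")
    return SEEDLING_RDR_SCALE[round(rating)]


# Example with a hypothetical mean rating for one treatment group.
print(describe_rdr(3.86))
```

Under this assumption a group mean of 3.86 falls nearest to "heavy feeding" and a group mean of 0.33 nearest to "no visible feeding".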

Results of the seedling assay are shown in Table 26. Plants expressing
Cry3Bb protein were completely protected from WCR feeding, and the larvae surviving
within this treatment had not grown. Mean larval weights ranged from 2.03-2.73 mg
for the nonexpressing treatments, whereas the surviving larval average weight was 0.11
mg on the expressing cry3Bb treatment. Root damage ratings were 3.86 and 0.33 for
the nonexpressing and expressing isopopulations, respectively. Larval survival
ranged from 75-85% for the negative and check treatments, whereas only 25% of the
larvae survived on the Cry3Bb treatment.

Table 26

Effect of Cry3Bb Expressing Plants on
WCR Larvae in a Seedling Assay

                        Plants                           Larvae
Event   Treatment   N   Root (ppm)   RDR±SD        N    % Surv   Mean±SD Wt. (mg)
16      Negative    7   0.0          3.86±0.65     21   75       2.73±1.67
16      Positive    3   29.01        0.33±0.45     3    25       0.11±0.07
A634    Check       4   0.0                        13   81       2.03±0.83
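
The percent survival column of Table 26 can be checked against the infestation rate given in the protocol (4 neonate larvae per seedling). The sketch below is illustrative only; it assumes that the larval N column counts surviving larvae and that survival is expressed relative to the number of larvae applied. The names table_26 and percent_survival are hypothetical.

```python
# Larvae applied per seedling, as stated in the seedling assay protocol.
LARVAE_PER_SEEDLING = 4

# (event, treatment): (plants scored, larvae recovered alive), from Table 26.
table_26 = {
    ("16", "Negative"): (7, 21),
    ("16", "Positive"): (3, 3),
    ("A634", "Check"): (4, 13),
}


def percent_survival(plants, survivors, larvae_per_plant=LARVAE_PER_SEEDLING):
    """Return surviving larvae as a percentage of the larvae applied."""
    applied = plants * larvae_per_plant
    return 100.0 * survivors / applied


survival = {key: percent_survival(*counts) for key, counts in table_26.items()}
for (event, treatment), pct in survival.items():
    print(f"Event {event} {treatment}: {pct:.0f}% survival")
```

Under these assumptions the computed values agree with the tabulated 75%, 25%, and 81% survival.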



For the 10 inch pot assay, at 4 weeks post infestation plant height was
recorded and a root damage rating (Iowa 1-6 scale; Hills and Peters, 1971) was given.

Results of the 10 inch pot assay are shown in Table 27. Plants expressing
Cry3Bb protein had significantly less feeding damage and were taller than the
nonexpressing plants. Event 16, the higher expressing of the two events, provided nearly
complete control. The negative treatments had very high root damage ratings,
indicating very high insect pressure. The positive mean root damage ratings were 3.4 and
2.2 for events 6 and 16, respectively. Mean RDRs for the negative treatments were 5.0
and 5.6.

Table 27

Effect of Cry3Bb Expressing Corn in Controlling
WCR Larval Feeding in a 10 Inch Pot Assay

Event   Treatment   N   Root (ppm)   RDR±SD      Plant Height (cm)
6       Negative    7   0.0          5.0±1.41    49.7±18.72
6       Positive    5   7.0          3.4±1.14    73.9±8.67
16      Negative    5   0.0          5.6±0.89    61.2±7.75
16      Positive    5   55.0         2.2±0.84    83.8±7.15
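
The within-event differences between the positive and negative isopopulations in Table 27 can be worked out directly from the tabulated means. This is a minimal sketch for illustration only; the names table_27 and compare are hypothetical, and it compares means without any statistical test.

```python
# Mean root damage rating (Iowa 1-6 scale) and plant height, from Table 27.
table_27 = {
    "6": {
        "Negative": {"rdr": 5.0, "height_cm": 49.7},
        "Positive": {"rdr": 3.4, "height_cm": 73.9},
    },
    "16": {
        "Negative": {"rdr": 5.6, "height_cm": 61.2},
        "Positive": {"rdr": 2.2, "height_cm": 83.8},
    },
}


def compare(event):
    """Return the RDR reduction and height gain of positive vs. negative plants."""
    neg = table_27[event]["Negative"]
    pos = table_27[event]["Positive"]
    return {
        "rdr_reduction": round(neg["rdr"] - pos["rdr"], 1),
        "height_gain_cm": round(pos["height_cm"] - neg["height_cm"], 1),
    }


for event in table_27:
    print(event, compare(event))
```

On these means, event 16 shows the larger reduction in root damage rating (3.4 units versus 1.6 for event 6), while both events show a height gain of more than 20 cm.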



In summary, corn plants expressing Cry3Bb protein have a significant biologi- 
cal effect on WCR larval development as seen in the seedling assay. When chal- 
lenged with very high infestation levels, plants expressing the Cry3Bb protein were 
protected from WCR larval feeding damage as illustrated in the 10 inch pot assay. 






6.0 Brief Description of the Sequence Identifiers 

SEQ ID NO:1  DNA sequence of cry3Bb.11221 gene.
SEQ ID NO:2  Amino acid sequence of Cry3Bb.11221 polypeptide.
SEQ ID NO:3  DNA sequence of cry3Bb.11222 gene.
SEQ ID NO:4  Amino acid sequence of Cry3Bb.11222 polypeptide.
SEQ ID NO:5  DNA sequence of cry3Bb.11223 gene.
SEQ ID NO:6  Amino acid sequence of Cry3Bb.11223 polypeptide.
SEQ ID NO:7  DNA sequence of cry3Bb.11224 gene.
SEQ ID NO:8  Amino acid sequence of Cry3Bb.11224 polypeptide.
SEQ ID NO:9  DNA sequence of cry3Bb.11225 gene.
SEQ ID NO:10  Amino acid sequence of Cry3Bb.11225 polypeptide.
SEQ ID NO:11  DNA sequence of cry3Bb.11226 gene.
SEQ ID NO:12  Amino acid sequence of Cry3Bb.11226 polypeptide.
SEQ ID NO:13  DNA sequence of cry3Bb.11227 gene.
SEQ ID NO:14  Amino acid sequence of Cry3Bb.11227 polypeptide.
SEQ ID NO:15  DNA sequence of cry3Bb.11228 gene.
SEQ ID NO:16  Amino acid sequence of Cry3Bb.11228 polypeptide.
SEQ ID NO:17  DNA sequence of cry3Bb.11229 gene.
SEQ ID NO:18  Amino acid sequence of Cry3Bb.11229 polypeptide.
SEQ ID NO:19  DNA sequence of cry3Bb.11230 gene.
SEQ ID NO:20  Amino acid sequence of Cry3Bb.11230 polypeptide.
SEQ ID NO:21  DNA sequence of cry3Bb.11231 gene.
SEQ ID NO:22  Amino acid sequence of Cry3Bb.11231 polypeptide.

SEQ ID NO:23  DNA sequence of cry3Bb.11232 gene.
SEQ ID NO:24  Amino acid sequence of Cry3Bb.11232 polypeptide.
SEQ ID NO:25  DNA sequence of cry3Bb.11233 gene.
SEQ ID NO:26  Amino acid sequence of Cry3Bb.11233 polypeptide.
SEQ ID NO:27  DNA sequence of cry3Bb.11234 gene.
SEQ ID NO:28  Amino acid sequence of Cry3Bb.11234 polypeptide.
SEQ ID NO:29  DNA sequence of cry3Bb.11235 gene.
SEQ ID NO:30  Amino acid sequence of Cry3Bb.11235 polypeptide.
SEQ ID NO:31  DNA sequence of cry3Bb.11236 gene.
SEQ ID NO:32  Amino acid sequence of Cry3Bb.11236 polypeptide.
SEQ ID NO:33  DNA sequence of cry3Bb.11237 gene.
SEQ ID NO:34  Amino acid sequence of Cry3Bb.11237 polypeptide.
SEQ ID NO:35  DNA sequence of cry3Bb.11238 gene.
SEQ ID NO:36  Amino acid sequence of Cry3Bb.11238 polypeptide.
SEQ ID NO:37  DNA sequence of cry3Bb.11239 gene.
SEQ ID NO:38  Amino acid sequence of Cry3Bb.11239 polypeptide.
SEQ ID NO:39  DNA sequence of cry3Bb.11241 gene.
SEQ ID NO:40  Amino acid sequence of Cry3Bb.11241 polypeptide.
SEQ ID NO:41  DNA sequence of cry3Bb.11242 gene.
SEQ ID NO:42  Amino acid sequence of Cry3Bb.11242 polypeptide.
SEQ ID NO:43  DNA sequence of cry3Bb.11032 gene.
SEQ ID NO:44  Amino acid sequence of Cry3Bb.11032 polypeptide.
SEQ ID NO:45  DNA sequence of cry3Bb.11035 gene.
SEQ ID NO:46  Amino acid sequence of Cry3Bb.11035 polypeptide.
SEQ ID NO:47  DNA sequence of cry3Bb.11036 gene.
SEQ ID NO:48  Amino acid sequence of Cry3Bb.11036 polypeptide.
SEQ ID NO:49  DNA sequence of cry3Bb.11046 gene.
SEQ ID NO:50  Amino acid sequence of Cry3Bb.11046 polypeptide.
SEQ ID NO:51  DNA sequence of cry3Bb.11048 gene.
SEQ ID NO:52  Amino acid sequence of Cry3Bb.11048 polypeptide.
SEQ ID NO:53  DNA sequence of cry3Bb.11051 gene.
SEQ ID NO:54  Amino acid sequence of Cry3Bb.11051 polypeptide.

SEQ ID NO:55  DNA sequence of cry3Bb.11057 gene.
SEQ ID NO:56  Amino acid sequence of Cry3Bb.11057 polypeptide.
SEQ ID NO:57  DNA sequence of cry3Bb.11058 gene.
SEQ ID NO:58  Amino acid sequence of Cry3Bb.11058 polypeptide.
SEQ ID NO:59  DNA sequence of cry3Bb.11081 gene.
SEQ ID NO:60  Amino acid sequence of Cry3Bb.11081 polypeptide.
SEQ ID NO:61  DNA sequence of cry3Bb.11082 gene.
SEQ ID NO:62  Amino acid sequence of Cry3Bb.11082 polypeptide.
SEQ ID NO:63  DNA sequence of cry3Bb.11083 gene.
SEQ ID NO:64  Amino acid sequence of Cry3Bb.11083 polypeptide.
SEQ ID NO:65  DNA sequence of cry3Bb.11084 gene.
SEQ ID NO:66  Amino acid sequence of Cry3Bb.11084 polypeptide.
SEQ ID NO:67  DNA sequence of cry3Bb.11095 gene.
SEQ ID NO:68  Amino acid sequence of Cry3Bb.11095 polypeptide.
SEQ ID NO:69  DNA sequence of cry3Bb.60 gene.
SEQ ID NO:70  Amino acid sequence of Cry3Bb.60 polypeptide.
SEQ ID NO:71  Primer FW001.
SEQ ID NO:72  Primer FW006.
SEQ ID NO:73  Primer MVT095.
SEQ ID NO:74  Primer MVT097.
SEQ ID NO:75  Primer MVT091.
SEQ ID NO:76  Primer MVT075.
SEQ ID NO:77  Primer MVT076.
SEQ ID NO:78  Primer MVT111.
SEQ ID NO:79  Primer MVT094.
SEQ ID NO:80  Primer MVT103.
SEQ ID NO:81  Primer MVT081.
SEQ ID NO:82  Primer MVT085.
SEQ ID NO:83  Primer A.
SEQ ID NO:84  Primer B.
SEQ ID NO:85  Primer C.
SEQ ID NO:86  Primer D.

SEQ ID NO:87  Primer E.
SEQ ID NO:88  Primer F.
SEQ ID NO:89  Primer G.
SEQ ID NO:90  Primer WD112.
SEQ ID NO:91  Primer WD115.
SEQ ID NO:92  Primer MVT105.
SEQ ID NO:93  Primer MVT092.
SEQ ID NO:94  Primer MVT070.
SEQ ID NO:95  Primer MVT083.
SEQ ID NO:96  N-terminal amino acid of Cry3Bb polypeptide.
SEQ ID NO:97  DNA sequence of wild-type cry3Bb gene.
SEQ ID NO:98  Amino acid sequence of wild-type Cry3Bb polypeptide.
SEQ ID NO:99  Plantized DNA sequence for cry3Bb.11231 gene.
SEQ ID NO:100  Amino acid sequence of plantized Cry3Bb.11231 polypeptide.
SEQ ID NO:101  DNA sequence of cry3Bb gene used to prepare SEQ ID NO:99.
SEQ ID NO:102  DNA sequence of wild-type cry3Bb gene, Genbank #M89794.
SEQ ID NO:103  DNA sequence of Oligo #1.
SEQ ID NO:104  DNA sequence of Oligo #2.
SEQ ID NO:105  DNA sequence of Oligo #3.
SEQ ID NO:106  DNA sequence of Oligo #4.
SEQ ID NO:107  DNA sequence of cry3Bb.11098 gene.
SEQ ID NO:108  Amino acid sequence of Cry3Bb.11098 polypeptide.

All of the compositions and methods disclosed and claimed herein can be
made and executed without undue experimentation in light of the present disclosure.
While the compositions and methods of this invention have been described in terms of
preferred embodiments, it will be apparent to those of skill in the art that variations
may be applied to the compositions and methods and in the steps or in the sequence of
steps of the method described herein without departing from the concept, spirit and
scope of the invention. More specifically, it will be apparent that certain agents which
are both chemically and physiologically related may be substituted for the agents
described herein while the same or similar results would be achieved. All such similar
substitutes and modifications apparent to those skilled in the art are deemed to be
within the spirit, scope and concept of the invention as defined by the appended
claims.



Claims:

1. An isolated B. thuringiensis Cry3Bb polypeptide modified to have improved
insecticidal activity or enhanced insecticidal specificity against a target insect,
said polypeptide comprising at least one amino acid substitution, one amino acid
addition, or one amino acid deletion in the primary sequence of the native or
unmodified Cry3Bb polypeptide, wherein said substitution or deletion occurs at
a position corresponding to from about amino acid 1 to about amino acid 365 of
the unmodified polypeptide's amino acid sequence.



2. The polypeptide of claim 1, wherein Asp103 is replaced by glutamic acid;
Ala104 is deleted; Thr154 is replaced by glycine or phenylalanine; Pro155 is
replaced by histidine; Leu156 is replaced by histidine; Leu158 is replaced by
arginine; Ser160 is replaced by asparagine; Lys161 is replaced by proline;
Pro162 is replaced by histidine; Asp165 is replaced by glycine; Lys189 is
replaced by glycine; Ser223 is replaced by proline; Tyr230 is replaced by leucine
or serine; His231 is replaced by arginine, asparagine, serine, or threonine;
Thr241 is replaced by serine; Tyr287 is replaced by phenylalanine; Asp288 is
replaced by asparagine; Ile289 is replaced by threonine or valine; Arg290 is
replaced by asparagine, leucine or valine; Leu291 is replaced by arginine;
Tyr292 is replaced by phenylalanine; Ser293 is replaced by arginine or proline;
Phe305 is replaced by serine; Ser311 is replaced by alanine, isoleucine, leucine,
or threonine; Leu312 is replaced by proline or valine; Asn313 is replaced by
arginine, histidine, threonine or valine; Thr314 is replaced by asparagine;
Leu315 is replaced by proline; Gln316 is replaced by aspartic acid, leucine,
methionine, or tryptophan; Glu317 is replaced by alanine, asparagine, lysine or
valine; Tyr318 is replaced by cysteine; Gln348 is replaced by arginine; or
Val365 is replaced by alanine.



3. The polypeptide of claim 1 or 2, wherein Thr154 is replaced by phenylalanine,
Pro155 is replaced by histidine, Leu156 is replaced by histidine, and Leu158 is
replaced by arginine.

4. The polypeptide of claim 1 or 2, wherein Tyr230 is replaced by leucine, and
His231 is replaced by serine.



5. The polypeptide of claim 1 or 2, wherein Ser223 is replaced by proline, and 
Tyr230 is replaced by serine. 

6. The polypeptide of claim 1 or 2, wherein His231 is replaced by arginine.

7. The polypeptide of claim 1 or 2, wherein His231 is replaced by asparagine, and
Thr241 is replaced by serine.

8. The polypeptide of claim 1 or 2, wherein His231 is replaced by threonine.

9. The polypeptide of claim 1 or 2, wherein Arg290 is replaced by asparagine. 

10. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by leucine, Asn313
is replaced by threonine, and Glu317 is replaced by lysine.

11. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by threonine,
Glu317 is replaced by lysine, and Tyr318 is replaced by cysteine.



12. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by alanine, Leu312
is replaced by valine, and Gln316 is replaced by tryptophan.

13. The polypeptide of claim 1 or 2, wherein His231 is replaced by arginine, Ser311
is replaced by leucine, Asn313 is replaced by threonine, and Glu317 is replaced
by lysine.

14. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by threonine, 
Leu312 is replaced by proline, Asn313 is replaced by threonine, and Glu317 is 
replaced by asparagine. 



15. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by alanine, and
Gln316 is replaced by aspartic acid.

16. The polypeptide of claim 1 or 2, wherein Ile289 is replaced by threonine,
Leu291 is replaced by arginine, Tyr292 is replaced by phenylalanine, and
Ser293 is replaced by arginine.



17. The polypeptide of claim 1 or 2, wherein His231 is replaced by arginine, and
Ser311 is replaced by leucine.

18. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by isoleucine.

19. The polypeptide of claim 1 or 2, wherein Ser311 is replaced by isoleucine, and
Asn313 is replaced by histidine.

20. The polypeptide of claim 1 or 2, wherein Asn313 is replaced by valine, Thr314
is replaced by asparagine, Gln316 is replaced by methionine, and Glu317 is
replaced by valine.

21. The polypeptide of claim 1 or 2, wherein Asn313 is replaced by arginine,
Leu315 is replaced by proline, Gln316 is replaced by leucine, and Glu317 is
replaced by alanine.

22. The polypeptide of claim 1 or 2, wherein Tyr287 is replaced by phenylalanine, 
Asp288 is replaced by asparagine, and Arg290 is replaced by leucine. 

23. The polypeptide of claim 1 or 2, wherein Arg290 is replaced by valine.

24. The polypeptide of claim 1 or 2, wherein Asp165 is replaced by glycine.

25. The polypeptide of claim 1 or 2, wherein Ser160 is replaced by asparagine,
Lys161 is replaced by proline, Pro162 is replaced by histidine, and Thr154 is
replaced by glycine.



26. The polypeptide of claim 1 or 2, wherein Ile289 is replaced by valine, and
Ser293 is replaced by proline.

27. The polypeptide of claim 1 or 2, wherein Ser160 is replaced by asparagine,
Lys161 is replaced by proline, Pro162 is replaced by histidine, Asp165 is
replaced by glycine, Ile289 is replaced by valine, and Ser293 is replaced by
proline.



28. The polypeptide of claim 1 or 2, wherein Asp103 is replaced by glutamic acid,
and Ala104 is deleted.

29. The polypeptide of claim 1 or 2, wherein Lys189 is replaced by glycine.

30. The polypeptide of claim 1 or 2, wherein Asp103 is replaced by glutamic acid,
Ala104 is deleted, Ser160 is replaced by asparagine, Lys161 is replaced by
proline, Pro162 is replaced by histidine, and Asp165 is replaced by glycine.

31. The polypeptide of claim 1 or 2, wherein Asp103 is replaced by glutamic acid,
Ala104 is deleted, Thr154 is replaced by phenylalanine, Pro155 is replaced by
histidine, Leu156 is replaced by histidine, and Leu158 is replaced by arginine.

32. The polypeptide of claim 1 or 2, wherein Asp165 is replaced by glycine, Ser311
is replaced by threonine, and Glu317 is replaced by lysine.

33. The polypeptide of claim 1 or 2, wherein Asp165 is replaced by glycine, Ile289
is replaced by valine, Ser293 is replaced by proline, Phe305 is replaced by
serine, Ser311 is replaced by alanine, Leu312 is replaced by valine, Gln316 is
replaced by tryptophan, Gln348 is replaced by arginine, and Val365 is replaced
by alanine.

34. The polypeptide of claim 1 or 2, wherein Ile289 is replaced by valine, Ser293 is
replaced by proline, and Gln348 is replaced by arginine.

35. The polypeptide of claim 1 or 2, wherein Asp165 is replaced by glycine, and
Ser311 is replaced by leucine.

36. The polypeptide of claim 1 or 2, wherein the first 159 amino acids are deleted.



37. The polypeptide of claim 1 or 2, wherein Gln348 is replaced by arginine. 



38. The polypeptide of claim 1 or 2, wherein Asp165 is replaced by glycine,
His231 is replaced by arginine, Ser311 is replaced by leucine, Asn313 is
replaced by threonine, and Glu317 is replaced by lysine.

39. The polypeptide of any preceding claim, wherein said polypeptide comprises an
amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ
ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ
ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ
ID NO:70, SEQ ID NO:100 and SEQ ID NO:108.



40. The polypeptide of any preceding claim, wherein said polypeptide is encoded by
a contiguous nucleic acid sequence selected from the group consisting of SEQ
ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID
NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ
ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29,
SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID
NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ
ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57,
SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID
NO:67, SEQ ID NO:69, SEQ ID NO:99 and SEQ ID NO:107.

41. A composition comprising an insecticidally-effective amount of the Cry3Bb
polypeptide of claim 1.

42. The composition of claim 41, comprising from about 0.5% to about 99% by
weight of the polypeptide of claim 1.



43. The composition of claim 41 or 42, wherein said polypeptide comprises an
amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ
ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ
ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ
ID NO:70, SEQ ID NO:72, SEQ ID NO:100, and SEQ ID NO:108.



44. The composition of any one of claims 41 to 43, wherein said polypeptide is
encoded by a nucleic acid sequence having the sequence of SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,
SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID
NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ



WO 99/31248 PCT/US98/26852 


ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID
NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ
ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67,
SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:99, or SEQ ID NO:107.



45. The composition of any one of claims 41 to 44, prepared by a process
comprising the steps of:

(a) culturing a Bacillus thuringiensis NRRL B-21744, NRRL B-21745,
NRRL B-21746, NRRL B-21747, NRRL B-21748, NRRL B-21749, NRRL
B-21750, NRRL B-21751, NRRL B-21752, NRRL B-21753, NRRL B-21754,
NRRL B-21755, NRRL B-21756, NRRL B-21757, NRRL B-21758, NRRL
B-21759, NRRL B-21760, NRRL B-21761, NRRL B-21762, NRRL B-21763,
NRRL B-21764, NRRL B-21765, NRRL B-21766, NRRL B-21767, NRRL
B-21768, NRRL B-21769, NRRL B-21770, NRRL B-21771, NRRL B-21772,
NRRL B-21773, NRRL B-21774, NRRL B-21775, NRRL B-21776, NRRL
B-21777, NRRL B-21778, NRRL B-21779, or EG11098 cell under conditions
effective to produce an insecticidal polypeptide; and

(b) obtaining said insecticidal polypeptide from said cell.

46. The composition of any one of claims 41 to 45, comprising a Bacillus
thuringiensis NRRL B-21744, NRRL B-21745, NRRL B-21746, NRRL
B-21747, NRRL B-21748, NRRL B-21749, NRRL B-21750, NRRL B-21751,
NRRL B-21752, NRRL B-21753, NRRL B-21754, NRRL B-21755, NRRL
B-21756, NRRL B-21757, NRRL B-21758, NRRL B-21759, NRRL B-21760,
NRRL B-21761, NRRL B-21762, NRRL B-21763, NRRL B-21764, NRRL
B-21765, NRRL B-21766, NRRL B-21767, NRRL B-21768, NRRL B-21769,
NRRL B-21770, NRRL B-21771, NRRL B-21772, NRRL B-21773, NRRL
B-21774, NRRL B-21775, NRRL B-21776, NRRL B-21777, NRRL B-21778, or
NRRL B-21779 cell.



47. The composition of any one of claims 41 to 46, wherein said composition
comprises a cell extract, cell suspension, protein fraction, crystal fraction, cell
culture, cell homogenate, cell lysate, cell supernatant, cell filtrate, or cell pellet
of a Bacillus thuringiensis NRRL B-21744, NRRL B-21745, NRRL B-21746,
NRRL B-21747, NRRL B-21748, NRRL B-21749, NRRL B-21750, NRRL
B-21751, NRRL B-21752, NRRL B-21753, NRRL B-21754, NRRL B-21755,
NRRL B-21756, NRRL B-21757, NRRL B-21758, NRRL B-21759, NRRL
B-21760, NRRL B-21761, NRRL B-21762, NRRL B-21763, NRRL B-21764,
NRRL B-21765, NRRL B-21766, NRRL B-21767, NRRL B-21768, NRRL
B-21769, NRRL B-21770, NRRL B-21771, NRRL B-21772, NRRL B-21773,
NRRL B-21774, NRRL B-21775, NRRL B-21776, NRRL B-21777, NRRL
B-21778, NRRL B-21779, or an EG11098 cell.

48. The composition of any one of claims 41 to 47, formulated as a powder,
granule, spray, emulsion, colloid, or solution.

49. The composition of any one of claims 41 to 48, wherein said composition is
prepared by desiccation, lyophilization, homogenization, freeze drying,
emulsification, evaporation, separation, extraction, filtration, centrifugation,
sedimentation, dilution, crystallization, or concentration.

50. A polynucleotide comprising an isolated sequence region that encodes the
polypeptide of any one of claims 1 to 40.

51. The polynucleotide of claim 50, comprising an isolated sequence region that
encodes a polypeptide that comprises an amino acid sequence selected from the
group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ
ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26,
SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID
NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ
ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54,
SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID
NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ
ID NO:100, and SEQ ID NO:108.

52. The polynucleotide of claim 50 or 51, comprising a contiguous nucleotide
sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3,
SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID
NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ
ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41,
SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID
NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ
ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69,
SEQ ID NO:99, SEQ ID NO:99, and SEQ ID NO:107.

53. The polynucleotide of any one of claims 50 to 53, characterized as DNA,
cDNA, rRNA, or mRNA.

54. The polynucleotide of any one of claims 50 to 53, wherein said polynucleotide
is from about 2000 to about 10,000 nucleotides in length.






55. The polynucleotide of any one of claims 50 to 54, wherein said nucleic acid 
segment is from about 3000 to about 8,000 nucleotides in length. 



56. The polynucleotide of any one of claims 50 to 55, wherein said isolated
sequence region is operably linked to a promoter, said promoter expressing said
sequence region.

57. The polynucleotide of any one of claims 50 to 56, wherein said isolated
sequence region is operably linked to a heterologous promoter.

58. The polynucleotide of any one of claims 50 to 57, wherein said isolated
sequence region is operably linked to a plant-expressible promoter.

59. The polynucleotide of any one of claims 50 to 58, wherein said isolated
sequence region is operably linked to a constitutive, inducible, or tissue-specific
promoter.






60. A vector comprising the polynucleotide of any one of claims 50 to 59, or a 
polynucleotide that encodes the polypeptide of any one of claims 1 to 40. 



61. The vector of claim 60, defined as a plasmid, a cosmid, a phagemid, a phage, a
virus, or a baculovirus.

62. The vector of claim 60 or 61, transformed and replicated in a prokaryotic or
eukaryotic host.



63. A virus comprising the polypeptide of any one of claims 1 to 40, or the 
polynucleotide of any one of claims 50 to 59. 



64. A transformed host cell comprising the polypeptide of any one of claims 1 to 40,
the polynucleotide of any one of claims 50 to 59, the vector of any one of claims
60 to 62, or the virus of claim 63.

65. The transformed host cell of claim 64, further defined as a prokaryotic or
eukaryotic cell.



66. The transformed host cell of claim 64 or 65, wherein said prokaryotic cell is a
eubacterial, archaebacterial or cyanobacterial cell, or wherein said eukaryotic
cell is an animal, fungal, or plant cell.

67. The transformed host cell of any one of claims 64 to 66, wherein said cell is an
E. coli, B. thuringiensis, A. tumefaciens, B. subtilis, B. megaterium, B. cereus,
Salmonella spp., or Pseudomonas spp. cell.

68. The transformed host cell of any one of claims 64 to 67, wherein said cell is
selected from the group consisting of B. thuringiensis NRRL B-21744, NRRL
B-21745, NRRL B-21746, NRRL B-21747, NRRL B-21748, NRRL B-21749,
NRRL B-21750, NRRL B-21751, NRRL B-21752, NRRL B-21753, NRRL
B-21754, NRRL B-21755, NRRL B-21756, NRRL B-21757, NRRL B-21758,
NRRL B-21759, NRRL B-21760, NRRL B-21761, NRRL B-21762, NRRL
B-21763, NRRL B-21764, NRRL B-21765, NRRL B-21766, NRRL B-21767,
NRRL B-21768, NRRL B-21769, NRRL B-21770, NRRL B-21771, NRRL
B-21772, NRRL B-21773, NRRL B-21774, NRRL B-21775, NRRL B-21776,
NRRL B-21777, NRRL B-21778, and NRRL B-21779.



69. The transformed host cell of claim 66, wherein said plant cell is a grain, tree,
legume, fiber, vegetable, fruit, berry, nut, citrus, grass, cactus, succulent, or
ornamental plant cell.

70. The transformed host cell of claim 69, wherein said plant cell is a corn, rice,
tobacco, alfalfa, soybean, sorghum, potato, tomato, flax, canola, sunflower,
cotton, flax, kapok, wheat, oat, barley, or rye cell.

71. The transformed host cell of any one of claims 64 to 70, wherein said
polynucleotide is introduced into said cell by a vector, virus, cosmid, phagemid,
phage, plasmid, or by electroporation, transformation, conjugation,
microprojectile bombardment, direct DNA injection, naked DNA transfer,
transformation, or transfection.

72. A transgenic plant comprising the polypeptide of any one of claims 1 to 40,
the polynucleotide of any one of claims 50 to 59, the vector of any one of
claims 60 to 62, the virus of claim 63, or the host cell of any one of claims 64
to 71.

73. The transgenic plant of claim 72, having incorporated into its genome a selected
polynucleotide that encodes the polypeptide of any one of claims 1 to 40.



74. The transgenic plant of claim 72 or 73, wherein said polypeptide comprises an
amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ
ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ
ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID
NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ
ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50,
SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ
ID NO:70, SEQ ID NO:100, and SEQ ID NO:108.

75. The transgenic plant of any one of claims 72 to 74, wherein said plant is a grain,
tree, legume, fiber, vegetable, fruit, berry, nut, citrus, grass, cactus, succulent, or
ornamental plant.

76. The transgenic plant of any one of claims 62 to 75, wherein said plant is a corn,
rice, tobacco, alfalfa, soybean, sorghum, potato, tomato, flax, canola, sunflower,
cotton, flax, kapok, wheat, oat, barley, or rye plant.



77. A progeny of any generation of the transgenic plant of any one of claims 72 to 
76. 



78. A seed of any generation of the transgenic plant of any one of claims 72 to 76. 



79. A seed of any generation of the progeny of claim 77. 



80. A plant grown from the seed of claim 78 or 79.

81. A method of killing a coleopteran insect, said method comprising the step of
contacting said insect with an insecticidally-effective amount of the
polypeptide of claim 1.

82. A method of controlling a coleopteran insect population, said method
comprising the step of providing to the environment of said insect population, an
insecticidally-effective amount of the polypeptide of claim 1.

83. The method of claim 81 or 82, wherein said polypeptide is obtained from a
cell extract, cell suspension, protein fraction, crystal fraction, cell culture, cell
homogenate, cell lysate, cell supernatant, cell filtrate, or cell pellet of a
Bacillus thuringiensis NRRL B-21744, NRRL B-21745, NRRL B-21746, NRRL B-21747,
NRRL B-21748, NRRL B-21749, NRRL B-21750, NRRL B-21751, NRRL B-21752,
NRRL B-21753, NRRL B-21754, NRRL B-21755, NRRL B-21756, NRRL B-21757,
NRRL B-21758, NRRL B-21759, NRRL B-21760, NRRL B-21761, NRRL B-21762,
NRRL B-21763, NRRL B-21764, NRRL B-21765, NRRL B-21766, NRRL B-21767,
NRRL B-21768, NRRL B-21769, NRRL B-21770, NRRL B-21771, NRRL B-21772,
NRRL B-21773, NRRL B-21774, NRRL B-21775, NRRL B-21776, NRRL B-21777,
NRRL B-21778, or NRRL B-21779 cell.

84. The method of any one of claims 81 to 83, wherein said polypeptide is
provided to said environment by spraying, dusting, sprinkling, soaking, aerating,
misting, atomizing, soil injection, soil tilling, seed coating, or seedling coating.



85. The method of any one of claims 81 to 84, wherein said polypeptide is formu- 
lated as a powder, granule, spray, emulsion, colloid, or solution. 

86. The method of any one of claims 81 to 85, wherein said polypeptide is
prepared by desiccation, lyophilization, homogenization, freeze drying,
emulsification, evaporation, separation, extraction, filtration, centrifugation,
sedimentation, dilution, crystallization, or concentration.



87. A method of preparing a Coleopteran-resistant transgenic plant, comprising the
steps of:

(a) transforming a plant cell with a polynucleotide comprising a selected
sequence region that encodes the polypeptide of claim 1, wherein said sequence
region is operably linked to a promoter which expresses said sequence region;
and

(b) generating from said plant cell a transgenic plant that comprises said
selected sequence region and that expresses said polypeptide.



88. The method of claim 87, wherein said sequence region encodes a polypeptide
that comprises an amino acid sequence selected from the group consisting of
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10,
SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID
NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ
ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38,
SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID
NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ
ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66,
SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, and SEQ ID NO:108.

89. A method of killing a Coleopteran insect, comprising feeding to said insect a
plant cell transformed with a polynucleotide that encodes an amino acid
sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14,
SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID
NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ
ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42,
SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID
NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ
ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70,
SEQ ID NO:100, and SEQ ID NO:108.

90. The method of claim 89, wherein said insect is killed by ingesting a portion of a 
transgenic plant that comprises said transformed cell. 



91. A method of preparing a plant seed resistant to Coleopteran insect attack, said
method comprising the steps of:

(a) transforming a plant cell with a nucleic acid segment comprising a
polynucleotide that encodes an amino acid sequence selected from the group
consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ
ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID
NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ
ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46,
SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID
NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ
ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:100, and SEQ ID
NO:108 to produce a transformed plant cell;

(b) growing said transformed plant cell under conditions effective to
produce a transgenic plant from said cell; and

(c) obtaining from said transgenic plant, a seed resistant to attack by said
Coleopteran insect.



92. The method of claim 91, wherein step (a) comprises transforming said plant cell
by electroporation, transfection, naked DNA uptake, protoplast generation, direct
transfer of DNA into pollen, embryo or pluripotent plant cell,
Agrobacterium-mediated transformation, particle bombardment, or microprojectile
bombardment.



93. The method of claim 91 or 92, wherein step (b) comprises generation of
pluripotent plant cells from said transformed plant cell.

94. A method for producing a modified Cry3Bb polypeptide having improved
insecticidal activity or specificity, comprising:

(a) obtaining a high-resolution three-dimensional crystal structure of said
polypeptide;

(b) locating in said crystal structure of said polypeptide one or more
regions of bound water, wherein said bound water forms a contiguous hydrated
surfaces separated by no more than about 16 Å;

(c) increasing the hydrophobicity of one or more amino acids of said
polypeptide in said region; and

(d) obtaining the modified Cry3Bb polypeptide so produced.



95. A method for producing a modified Cry3Bb polypeptide having improved
insecticidal activity, or enhanced insecticidal specificity, comprising:

(a) obtaining a high-resolution three-dimensional crystal structure of said
polypeptide;

(b) identifying a loop region in said polypeptide;

(c) modifying one or more amino acids in said loop region to increase the
hydrophobicity of one or more of said amino acids; and

(d) obtaining the modified Cry3Bb polypeptide so produced.
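
By way of illustration only, the step of increasing the hydrophobicity of an amino acid recited in claims 94 and 95 may be sketched as follows. The claims do not name a hydrophobicity scale; the Kyte-Doolittle hydropathy index used here, and the function names `hydropathy_change` and `more_hydrophobic_substitutions`, are assumptions introduced for this example.

```python
# Kyte-Doolittle hydropathy index (higher value = more hydrophobic).
# ASSUMPTION: the claims do not specify a scale; this one is used only
# as a common reference scale for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_change(old, new):
    """Return the change in hydropathy when residue `old` becomes `new`.

    A positive value means the substitution increases hydrophobicity.
    """
    return KYTE_DOOLITTLE[new] - KYTE_DOOLITTLE[old]


def more_hydrophobic_substitutions(old):
    """List residues more hydrophobic than `old`, most hydrophobic first."""
    base = KYTE_DOOLITTLE[old]
    candidates = [aa for aa, value in KYTE_DOOLITTLE.items() if value > base]
    return sorted(candidates, key=lambda aa: -KYTE_DOOLITTLE[aa])


if __name__ == "__main__":
    # Hypothetical example: replacing a serine with a leucine.
    print(round(hydropathy_change("S", "L"), 1))
    print(more_hydrophobic_substitutions("T"))
```

Under this assumed scale a candidate substitution in a loop region or hydrated surface would qualify as increasing hydrophobicity when `hydropathy_change` returns a positive value; other scales may rank some residues differently.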



96. A method for increasing the mobility of channel forming helices of a Cry3Bb
polypeptide, comprising disrupting one or more hydrogen bonds formed
between a first amino acid of one or more of said channel forming helices and a
second amino acid of said polypeptide.



97. The method of claim 96, wherein said hydrogen bonds are formed inter- or
intramolecularly.



98. The method of claim 96, wherein said disrupting comprises replacing said first
amino acid or said second amino acid with a third amino acid whose spatial
distance bond angle is greater than about 3 Å, or whose spatial orientation is
not equal to 180 ± 60 degrees relative to the hydrogen bonding site of said first
or said second amino acid.
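
The geometric test recited in claim 98 may be illustrated by the following minimal sketch. It assumes that the distance is measured between the donor and acceptor heavy atoms and that the orientation is the donor-hydrogen-acceptor angle; neither convention is stated in the claim, and the function names and coordinates are hypothetical.

```python
import math

MAX_DISTANCE_ANGSTROM = 3.0   # "greater than about 3 Å" in claim 98
IDEAL_ANGLE_DEG = 180.0       # "180 ± 60 degrees" in claim 98
ANGLE_TOLERANCE_DEG = 60.0


def distance(p, q):
    """Euclidean distance between two 3-D points given as (x, y, z)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees, in the range 0 to 180."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = distance(a, vertex) * distance(b, vertex)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def hbond_is_disrupted(donor, hydrogen, acceptor):
    """True when the geometry falls outside the assumed hydrogen-bond limits.

    ASSUMPTION: distance is donor-to-acceptor; angle is donor-H-acceptor.
    """
    too_far = distance(donor, acceptor) > MAX_DISTANCE_ANGSTROM
    off_axis = abs(angle_deg(donor, hydrogen, acceptor) - IDEAL_ANGLE_DEG) > ANGLE_TOLERANCE_DEG
    return too_far or off_axis


if __name__ == "__main__":
    # Hypothetical coordinates in Å: a linear bond, then a stretched one.
    print(hbond_is_disrupted((0, 0, 0), (1, 0, 0), (2.8, 0, 0)))
    print(hbond_is_disrupted((0, 0, 0), (1, 0, 0), (4.0, 0, 0)))
```

In practice the coordinates would be taken from the crystal structure of the polypeptide and of a model of the substituted residue.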

99. A method for increasing the flexibility of a loop region in a channel forming
domain of a Cry3Bb polypeptide, comprising:

(a) obtaining a crystal structure of a Cry3Bb polypeptide having one or
more loop regions between adjacent α-helices;

(b) identifying the amino acids comprising said loop region; and 

(c) altering one or more of said amino acids in said loop region to reduce 
the steric hindrance in said region, wherein said altering increases flexibility of 
said loop region in said polypeptide. 

100. A method of increasing the insecticidal activity of a Cry3Bb polypeptide, 
comprising reducing or eliminating binding of said polypeptide to a carbohy- 
drate in a target insect gut. 

101. The method of claim 100, wherein said reducing or eliminating is
accomplished by removal of one or more α-helices of domain 1 of said polypeptide.

102. The method of claim 100, wherein said reducing or eliminating is accomplished
by removal of α-helices α1, α2a/b, or α3.

103. The method of claim 102, wherein said reducing or eliminating is
accomplished by replacing one or more amino acids within loop β1,α8, with one or
more amino acids having increased hydrophobicity.

104. The method of claim 103, wherein said reducing or eliminating is accom- 
plished by replacing with any other amino acid, one or more amino acids se- 
lected from the group consisting of threonine 154, proline 155, leucine 156, 
and leucine 158. 

105. A method of preparing a modified Cry3Bb polypeptide having improved in- 
secticidal activity or enhanced insecticidal specificity when compared to an 
unmodified Cry3Bb polypeptide, said method comprising the steps of: 

(a) obtaining a crystal structure of said polypeptide; 

(b) identifying from said crystal structure one or more surface-exposed
amino acids in said polypeptide;

(c) randomly substituting one or more of said surface-exposed amino acids
to obtain a plurality of mutated polypeptides, wherein at least 50% of said
mutated polypeptides have diminished insecticidal activity, or reduced
insecticidal specificity;

(d) identifying from said plurality of mutated polypeptides a region of said
Cry3Bb polypeptide for targeted mutagenesis; and

(e) mutagenizing said region to obtain said Cry3Bb polypeptide having
improved insecticidal activity or enhanced insecticidal specificity.






106. The method of claim 105, further comprising determining the amino acid se- 
quences of a plurality of mutated polypeptides having diminished activity or 
reduced insecticidal specificity, and identifying one or more amino acid resi- 
dues required for said activity or specificity. 



107. A method for producing a Cry3Bb polypeptide having improved insecticidal
activity, comprising:

(a) obtaining a high-resolution three-dimensional crystal structure of said
polypeptide;

(b) determining the electrostatic surface distribution of said polypeptide;

(c) identifying one or more regions of high electrostatic diversity;

(d) modifying the electrostatic diversity of said region by altering one or
more amino acids in said region; and

(e) obtaining said Cry3Bb polypeptide having improved insecticidal
activity.

108. The method of claim 107, wherein said electrostatic diversity is decreased
relative to the electrostatic diversity of a native Cry3Bb polypeptide.

109. The method of claim 107, wherein said electrostatic diversity is increased
relative to the electrostatic diversity of a native Cry3Bb polypeptide.

110. A method of producing a Cry3Bb polypeptide having improved insecticidal
activity, comprising: 

(a) obtaining a high-resolution three-dimensional crystal structure; 

(b) identifying the presence of one or more metal binding sites in said 
polypeptide; 

(c) altering one or more amino acids in said binding site; and 

(d) obtaining an altered polypeptide, wherein said polypeptide has im- 
proved insecticidal activity. 

111. The method of claim 110, wherein said altering eliminates metal binding.

112. A method of identifying a Cry3Bb polypeptide having improved channel ac- 
tivity, comprising: 

(a) obtaining a Cry3Bb polypeptide suspected of having improved channel 
activity; 

(b) determining one or more of the following characteristics in said
polypeptide, and in a wild-type polypeptide: the rate of channel formation, the
rate of growth of channel conductance or the duration of open channel state;

(c) comparing said characteristics of said mutant and said wild-type; and

(d) identifying said polypeptide having an increased rate of channel
formation compared to said wild-type polypeptide.



113. A method for producing a modified Cry3Bb polypeptide, having improved
insecticidal activity, comprising altering one or more non-surface amino acids
located at or near the point of greatest convergence of two or more loop
regions of said Cry3Bb polypeptide, wherein said altering decreases the mobility
of one or more of said loop regions.



114. The method of claim 113, wherein said mobility is determined by comparing
the thermal denaturation of said modified protein to a wild-type Cry3Bb
polypeptide.



WO 99/31248 PCT/US98/26852 

247 

115. A method of improving the insecticidal activity of a Cry3 polypeptide, said 
method comprising inserting one or more protease sensitive sites into one or 
more loop regions of domain 1 of said polypeptide. 



116. The method of claim 115, wherein said loop region is α3,4.

Sheet 1

β Strand    Amino Acid Residue
β2          339-350
β3a         256-360
β3b         362-368
β4          375-379
β5          390-395

ALIGNMENT OF CRY3 SEQUENCES

(Numbered according to Cry3BB)
(alpha helices underlined, beta sheets marked with +++'s)

1 10 20 30 40

CRY3C: MNPNNRSEHDTIKATENNEVSNNHAQYPLADTP--TLEELNY
CRY3BB2: MNPNNRSEHDTIKVTPNSELPTNHNQYPLADNPNSTLEELNY
CRY3BB: MNPNNRSEHDTIKVTPNSELQTNHNQYPLADNPNSTLEELNY
CRY3BA: MIRMGGRKMNPNNRSEYDTIKVTPNSELPTNHNQYPLADNPNSTLEELNY
CRY3A: MIRKGGRKMNPNNRSEHDTIKTTENNEVPTNHVQYPLAETPNPTLEDLNY



50 60 70 80 90 

CRY3C: KEFLRRTTDNNVEALDSSTTKDAIQKGISIIGDLLGVVGFPYGGALVSFY
CRY3BB2: KEFLRMTEDSSTEVLDNSTVKDAVGTGISVVGQILGVVGVPFAGALTSFY
CRY3BB: KEFLRMTEDSSTEVLDNSTVKDAVGTGISVVGQILGVVGVPFAGALTSFY
CRY3BA: KEFLRMTADNSTEVLDSSTVKDAVGTGISVVGQILGVVGVPFAGALTSFY
CRY3A: KEFLRMTADNNTEALDSSTTKDVIQKGISVVGDLLGVVGFPFGGALVSFY

100 110 120 130 140 

CRY3C: TNLLNTIWPGE-DPLKAFMQQVEALIDQKIADYAKDKATAELQGLKNVFK 
CRY3BB2: QSFLDTIWPSDADPWKAFMAQVEVLIDKKIEEYAKSKALAELQGLQNNFE 
CRY3BB: QSFLNTIWPSDADPWKAFMAQVEVLIDKKIEEYAKSKALAELQGLQNNFE
CRY3BA: QSFLNAIWPSDADPWKAFMAQVEVLIDKKIEEYAKSKALAELQGLQNNFE
CRY3A: TNFLNTIWPSE-DPWKAFMEQVEALMDQKIADYAKNKALAELQGLQNNVE 



150 160 170 180 190 

CRY3C: DYVSALDSWDKTPLTLRDGRSQGRIRELFSQAESHFRRSMPSFAVSGYEV 
CRY3BB2: DYVNALNSWKKTPLSLRSKRSQDRIRELFSQAESHFRNSMPSFAVSKFEV 
CRY3BB: DYVNALNSWKKTPLSLRSKRSQDRIRELFSQAESHFRNSMPSFAVSKFEV
CRY3BA: DYVNALDSWKKAPVNLRSRRSQDRIRELFSQAESHFRNSMPSFAVSKFEV
CRY3A: DYVSALSSWQKNPVSSRNPHSQGRIRELFSQAESHFRNSMPSFAISGYEV



FIG. 17A

200 210 220 230 240

CRY3C: LFLPTYAQAANTHLLLLKDAQIYGTDWGYSTDDLNEFHTKQKDLTIEYTN
CRY3BB2: LFLPTYAQAANTHLLLLKDAQVFGEEWGYSSEDVAEFYHRQLKLTQQYTD
CRY3BB: LFLPTYAQAANTHLLLLKDAQVFGEEWGYSSEDVAEFYHRQLKLTQQYTD
CRY3BA: LFLPTYAQAANTHLLLLKDAQVFGEEWGYSSEDIAEFYQRQLKLTQQYTD
CRY3A: LFLTTYAQAANTHLFLLKDAQIYGEEWGYEKEDIAEFYKRQLKLTQEYTD

250 260 270 280 290 

CRY3C: HCAKWYKAGLDKLRGSTYEEWVKFNRYRREMTLTVLDLITLFPLYDVRTY 
CRY3BB2: HCVNWYNVGLNGLRGSTYDAWVKFNRFRREMTLTVLDLIVLFPFYDVRLY
CRY3BB: HCVNWYNVGLNGLRGSTYDAWVKFNRFRREMTLTVLDLIVLFPFYDIRLY
CRY3BA: HCVNWYNVGLNSLRGSTYDAWVKFNRFRREMTLTVLDLIVLFPFYDVRLY
CRY3A: HCVKMYNVGLDKLRGSSYESWVNFNRYRREMTLTVLDLIALFPLYDVRLY

300 310 320 330 340 

CRY3C: TKGVKTELTRDVLTDPIVAVNNMNGYGTTFSNIENYIRKPHLFDYLHAIQ 
CRY3BB2: SKGVKTELTRDIFTDPIFSLNTLQEYGPTFLSIENSIRKPHLFDYLQGIE
CRY3BB: SKGVKTELTRDIFTDPIFSLNTLQEYGPTFLSIENSIRKPHLFDYLQGIE
CRY3BA: SKGVKTELTRDIFTDPIFTLNALQEYGPTFSSIENSIRKPHLFDYLRGIE
CRY3A: PKEVKTELTRDVLTDPIVGVNNLRGYGTTFSNIENYIRKPHLFDYLHRIQ



350 360 370 380 390 

CRY3C: FHSRLQPGYFGTDSFNYWSGNYVSTRSSIGSDEIIRSPFYGNKSTLDVQN 
CRY3BB2: FHTRLQPGYSGKDSFNYWSGNYVETRPSIGSSKTITSPFYGDKSTEPVQK 
CRY3BB: FHTRLQPGYFGKDSFNYWSGNYVETRPSIGSSKTITSPFYGDKSTEPVQK 
CRY3BA: FHTRLRPGYSGKDSFNYWSGNYVETRPSIGSNDTITSPFYGDKSIEPIQK 
CRY3A: FHTRFQPGYYGNDSFNYWSGNYVSTRPSIGSNDIITSPFYGNKSSEPVQN 



+++++++ +++++ ++++++++ +++++



400 410 420 430 

CRY3C: LEFNGEKVFRAVANGNLAVWPVGTGGTKIHSGVTKVQFSQYNDRKDEVRT 
CRY3BB2: LSFDGQKVYRTIANTDVAAWPNG----KIYFGVTKVDFSQYDDQKNETST
CRY3BB: LSFDGQKVYRTIANTDVAAWPNG----KVYLGVTKVDFSQYDDQKNETST
CRY3BA: LSFDGQKVYRTIANTDIAAFPDG----KIYFGVTKVDFSQYDDQKNETST
CRY3A: LEFNGEKVYRAVANTNLAVWPSA-----VYSGVTKVEFSQYNDQTDEAST

++++ ++++++++ ++++

440 450 460 470 480

CRY3C: QTYDSKRNVGGIV-FDSIDQLPPITTDESLEKAYSHQLNYVRCFLLQGGR
CRY3BB2: QTYDSKRNNGHVGAQDSIDQLPPETTDEPLEKAYSHQLNYAECFLMQDRR
CRY3BB: QTYDSKRNNGHVSAQDSIDQLPPETTDEPLEKAYSHQLNYAECFLMQDRR 
CRY3BA: QTYDSKRYNGYLGAQDSIDQLPPETTDEPLEKAYSHQLNYAECFLMQDRR 
CRY3A: QTYDSKRNVGAVS-WDSIDQLPPETTDEPLEKGYSHQLNYVMCFLMQGSR 

++++ + +++++++ 



490 500 510 520 530 

CRY3C: GIIPVFTWTHKSVDFYNTLDSEKITQIPFVKAFILVNSTSVVAGPGFTGG
CRY3BB2: GTIPFFTWTHRSVDFFNTIDAEKITQLPVVKAYALSSGASIIEGPGFTGG
CRY3BB: GTIPFFTWTHRSVDFFNTIDAEKITQLPVVKAYALSSGASIIEGPGFTGG
CRY3BA: GTIPFFTWTHRSVDFFNTIDAEKITQLPVVKAYALSSGASIIEGPGFTGG
CRY3A: GTIPVLTWTHKSVDFFNMIDSKKITQLPLVKAYKLQSGASVVAGPRFTGG

+++++++ +++++ +++++++



540 550 560 570 580 

CRY3C: DII-KCT-NGSGLTLYVTPAPDLTYSKTYKIRIRYASTSQVRFGIDLGSY
CRY3BB2: NLLFLKESSNSIAKFKVTL-NSAALLQRYRVRIRYASTTNLRLFVQNSNN
CRY3BB: NLLFLKESSNSIAKFKVTL-NSAALLQRYRVRIRYASTTNLRLFVQNSNN
CRY3BA: NLLFLKESSNSIAKFKVTL-NSAALLQRYRVRIRYASTTNLRLFVQNSNN
CRY3A: DII-QCTENGSAATIYVTPD--VSYSQKYRARIHYASTSQITFTLSLDGA

++++++++ ++++++++++++



590 600 610 620 630 

CRY3C: THSISYFDKTMDKGNTLTYNSFNLSSVSRPIEISG-GNKIGVSVGGIGSG 
CRY3BB2: DFIVIYINKTMNIDDDLTYQTFDLATTNSNMGFSGDTNELIIGAESFVSN 
CRY3BB: DFLVIYINKTMNKDDDLTYQTFDLATTNSNMGFSGDKNELIIGAESFVSN
CRY3BA: DFLVIYINKTMNIDGDLTYQTFDFATSNSNMGFSGDTNDFIIGAESFVSN
CRY3A: PFNQYYFDKTINKGDTLTYNSFNLASFSTPFELSG--NNLQIGVTGLSAG

+++++++ +++++++ ++++



640 650 
CRY3C: DEVYIDKIEFIPMD 
CRY3BB2: EKIYIDKIEFIPVQL 
CRY3BB: EKIYIDKIEFIPVQL 
CRY3BA: EKIYIDKIEFIPVQ 
CRY3A: DKVYIDKIEFIPVN 

+++++++++++++ 
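
For illustration, the degree of conservation between two rows of the alignment above can be estimated with the short sketch below. It counts identical residues over the columns in which neither sequence has a gap; this definition of identity, and the function name `aligned_identity`, are assumptions for the example and are not taken from the specification. The two rows used are the CRY3BB and CRY3A rows of the block numbered 100 to 140.

```python
def aligned_identity(seq_a, seq_b, gap="-"):
    """Count identical residues over gap-free columns of two aligned rows.

    Returns (matches, compared_columns). Both rows must already be aligned
    and therefore have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either row has a gap
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# Rows copied from the alignment block numbered 100 to 140.
CRY3BB_BLOCK = "QSFLNTIWPSDADPWKAFMAQVEVLIDKKIEEYAKSKALAELQGLQNNFE"
CRY3A_BLOCK = "TNFLNTIWPSE-DPWKAFMEQVEALMDQKIADYAKNKALAELQGLQNNVE"

if __name__ == "__main__":
    matches, compared = aligned_identity(CRY3BB_BLOCK, CRY3A_BLOCK)
    print(f"{matches}/{compared} identical ({100.0 * matches / compared:.1f}%)")
```

The same function can be applied block by block and the counts summed to obtain an identity figure over the full alignment.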
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT : 

(A) NAME: ECOGEN, INC./MONSANTO COMPANY

(B) STREET: 2005 CABOT BLVD W/700 CHESTERFIELD VILLAGE PKY N

(C) CITY: LANGHORNE/ST. LOUIS 

(D) STATE: PA/MO 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 19047/63198

(A) NAME: LEIGH H. ENGLISH 

(B) STREET: 120 CHAPEL DR 

(C) CITY: CHURCHVILLE

(D) STATE: PA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 18966 

(A) NAME: SUSAN M. BRUSSOCK 

(B) STREET: 7 HILLSIDE LN 

(C) CITY: NEW HOPE 

(D) STATE: PA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 18938

(A) NAME: THOMAS M. MALVAR 

(B) STREET: 12046 CHARTER HOUSE LN

(C) CITY: ST. LOUIS 

(D) STATE: MO 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 63146 

(A) NAME: JAMES W. BRYSON 

(B) STREET: 87 WOOD STREAM DR 

(C) CITY: LANGHORNE 

(D) STATE: PA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 19053 

(A) NAME: CAROLINE A. KULESZA 

(B) STREET: 301 OLD LYNCHBURG RD 

(C) CITY: CHARLOTTESVILLE 

(D) STATE: VA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 22903 

(A) NAME: FREDERICK S. WALTERS 

(B) STREET: 3413 6TH AVE 

(C) CITY: BEAVER FALLS 

(D) STATE: PA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 15010 

(A) NAME: STEPHEN L. SLATIN

(B) STREET: 3823 LESLIE PL 

(C) CITY: FAIR LAWN 

(D) STATE: NJ 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 07410 

(A) NAME: MICHAEL A. VON TERSCH 

(B) STREET: 14 RUTLEDGE AVE

(C) CITY: TRENTON 

(D) STATE: NJ 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 08618 

(A) NAME: CHARLES ROMANO 

(B) STREET: 2402 MAPLE CROSSING DR 

(C) CITY: WILDWOOD 

(D) STATE: MO 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP) : 63011

(ii) TITLE OF INVENTION: INSECT-RESISTANT TRANSGENIC PLANTS AND
METHODS FOR IMPROVING DELTA-ENDOTOXIN ACTIVITY AGAINST
TARGET INSECTS

(iii) NUMBER OF SEQUENCES: 113 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA: 

APPLICATION NUMBER: UNKNOWN 
(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/993,170 

(B) FILING DATE: 18-DEC-1997 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/993,722 

(B) FILING DATE: 18-DEC-1997 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/993,775 

(B) FILING DATE: 18-DEC-1997 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/996,441 

(B) FILING DATE: 18-DEC-1997 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA TTT CAC CAT TCT CGT CGT TCT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 



TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
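
As a consistency check on the listing above, the relationship between the nucleotide sequence and the encoded amino acids can be sketched as follows. The sketch assumes the standard genetic code; the function name `translate` is introduced for this example only. The two fragments are the first and last lines of SEQ ID NO: 1 with the spaces removed.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA string to one-letter amino acids.

    Translation stops at the first stop codon, which is not included
    in the returned string. Trailing bases that do not fill a codon
    are ignored.
    """
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 48 and last 39 nucleotides of SEQ ID NO: 1 as listed above.
FIRST_LINE = "ATGAATCCAAACAATCGAAGTGAACATGATACGATAAAGGTTACACCT"
LAST_LINE = "TATATAGATAAGATAGAATTTATCCCAGTACAATTGTAA"

if __name__ == "__main__":
    print(translate(FIRST_LINE))
    print(translate(LAST_LINE))
    # 652 codons plus one stop codon account for the stated 1959 base pairs.
    print(3 * 652 + 3)
```

The coding region 1..1956 therefore corresponds to the 652 amino acids of SEQ ID NO: 2, with nucleotides 1957 to 1959 forming the stop codon.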



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC CTT AGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Leu Ser Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460



GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Leu Ser Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT CCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Pro Glu
210 215 220

GAT GTT GCT GAA TTC AGT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Ser His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Pro Glu
210 215 220

Asp Val Ala Glu Phe Ser His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT CGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr He Lys Val Thr Pro 
15 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 
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GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 BO 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 3 84 

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 4 80 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTC TAT AAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr Asn Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

TCT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Ser Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
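
The feature coordinates given for SEQ ID NO: 9 can be checked with simple arithmetic: a CDS at 1..1956 holds 652 codons, and the remaining three bases (1957..1959) are the TAA stop codon, which accounts for the stated length of 1959 base pairs. The following is a minimal sketch of that check, together with a translation of the first row of the listing. The helper names and the partial codon table are illustrative assumptions and are not part of the sequence listing itself.

```python
# Minimal sketch: helper names are hypothetical, and the codon table is
# deliberately partial (standard genetic code, only the codons used below).
CODON_TABLE = {
    "ATG": "Met", "AAT": "Asn", "CCA": "Pro", "AAC": "Asn", "CGA": "Arg",
    "AGT": "Ser", "GAA": "Glu", "CAT": "His", "GAT": "Asp", "ACG": "Thr",
    "ATA": "Ile", "AAG": "Lys", "GTT": "Val", "ACA": "Thr", "CCT": "Pro",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cds_codon_count(start, end):
    """Return the number of codons in a CDS with 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna):
    """Translate whitespace-separated codons into three-letter residue codes."""
    return [CODON_TABLE[codon] for codon in dna.split()]


# First row of SEQ ID NO: 9 (bases 1..48) and its final codon (bases 1957..1959).
FIRST_ROW = "ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT"
LAST_CODON = "TAA"

if __name__ == "__main__":
    print(cds_codon_count(1, 1956))          # codons encoded by the CDS
    print(" ".join(translate(FIRST_ROW)))    # residues 1..16
    print(LAST_CODON in STOP_CODONS)         # the trailing triplet is a stop
```

The codon count obtained this way agrees with the 652 amino acids stated for SEQ ID NO: 10. A full translation would need the complete 64-entry codon table rather than the partial one assumed here.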



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Asn Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Ser Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTC TAT ACC AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr Thr Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Thr Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 



AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 



CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 



TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT AAT TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Asn Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Asn Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 



TTT ACG GAT CCA ATT TTT TTA CTT ACT ACG CTT CAG AAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 355 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 652 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT ACC CTT AAT ACA CTA CAG AAG TGC GGA CCA 960
Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Cys Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 



His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Cys Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:



ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT GCC GTT AAT ACT CTG TGG GAA TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 



Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 



GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190



CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTC TAT CGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 



GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 



ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TTA CTT ACT ACG CTT CAG AAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320 



ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400



TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 



TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 



GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125



Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300



Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT ACG CCA ACC ACC CTA CAG GAT TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Thr Pro Thr Thr Leu Gln Asp Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220



Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240



Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Thr Pro Thr Thr Leu Gln Asp Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 



ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT GCC CTG AAT ACC TTA GAC GAG TAC GGA CCA 960
Phe Thr Asp Pro Ile Phe Ala Leu Asn Thr Leu Asp Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ala Leu Asn Thr Leu Asp Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350



Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 






Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAC GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ACT AGG CGA TTC AGA AAG GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Thr Arg Arg Phe Arg Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140



Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Thr Arg Arg Phe Arg Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560



Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 



GAT GTT GCT GAA TTC TAT CGT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TTA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
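
The figures given for SEQ ID NO: 29 (1959 base pairs, CDS at 1..1956, 652 residues in the translated sequence) can be cross-checked with a short script. The following is only an illustrative sketch, not part of the original listing: the function names are hypothetical, and the codon table is deliberately partial, holding just the standard-genetic-code entries needed for the first codon line of the sequence.

```python
# Partial codon table (standard genetic code), limited to the codons that
# occur in the first line of SEQ ID NO: 29. This is NOT a complete table.
CODONS = {
    "ATG": "Met", "AAT": "Asn", "AAC": "Asn", "CCA": "Pro", "CCT": "Pro",
    "CGA": "Arg", "AGT": "Ser", "GAA": "Glu", "CAT": "His", "GAT": "Asp",
    "ACG": "Thr", "ACA": "Thr", "ATA": "Ile", "AAG": "Lys", "GTT": "Val",
}


def codon_count(cds_start: int, cds_end: int) -> int:
    """Return the number of codons in a CDS with 1-based inclusive bounds."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(codon_line: str) -> str:
    """Translate a space-separated line of codons into three-letter residues."""
    return " ".join(CODONS[codon] for codon in codon_line.split())


# Values copied from the feature table and first codon line of SEQ ID NO: 29.
TOTAL_LENGTH = 1959
CDS = (1, 1956)
FIRST_LINE = "ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT"

residues = codon_count(*CDS)                   # residues encoded by the CDS
trailing = TOTAL_LENGTH - CDS[1]               # bases left over for the stop codon
first_residues = translate(FIRST_LINE)         # residues 1-16
running_total = 3 * len(FIRST_LINE.split())    # right-hand base count on that line

print(residues, trailing, running_total)
print(first_residues)
```

Under these assumptions, the codon count agrees with the stated length of the translated protein, the three bases beyond the CDS correspond to the terminal TAA codon, and each full line of sixteen codons advances the right-hand base count by 48.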



(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 652 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540



Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT ATC CTC AAT ACG CTA CAG GAG TAC GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ile Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 



ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 



WO 99/31248 



PCT/US98/26852 




Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ile Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT ATC CTA CAT ACG CTG CAG GAG TAC GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ile Leu His Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 



AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ile Leu His Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 



Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCC CTC GTT AAC CTA ATG GTG TAC GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Val Asn Leu Met Val Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 
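
The header figures given for SEQ ID NO: 35 (1959 base pairs, CDS at 1..1956) can be cross-checked against the 652-residue translation listed as SEQ ID NO: 36. The following is only a minimal sketch of that arithmetic; the function and variable names are hypothetical and are not part of the listing.

```python
# Illustrative consistency check; names are hypothetical, values are copied
# from the SEQ ID NO: 35 header and from the last codon shown in the listing.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS with 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


total_length = 1959          # (A) LENGTH: 1959 base pairs
cds_start, cds_end = 1, 1956  # (B) LOCATION: 1..1956
final_codon = "TAA"          # codon following the last residue in the listing

residues = cds_codon_count(cds_start, cds_end)
trailing_bases = total_length - cds_end

print(residues, trailing_bases, final_codon in STOP_CODONS)
```

Under these assumptions the coding region accounts for 652 codons, and the three remaining bases correspond to the terminal stop codon.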



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Val Asn Leu Met Val Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 



CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCT CTT AGG ACA CCA CTT GCG TAC GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Arg Thr Pro Leu Ala Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Arg Thr Pro Leu Ala Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 



CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TTC AAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Phe Asn
275 280 285 

ATT TTG CTT TAC AGT AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Leu Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415



GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 



TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Phe Asn
275 280 285

Ile Leu Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT GTG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Val Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 652 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 



Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

He Val Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gin Gly Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 



Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 



AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AAT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

CCA CAC AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Pro His Ser Gin Gly Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 



CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 



CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 



GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 46: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 652 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

Pro His Ser Gin Gly Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1956 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro 
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 



TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 



ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr He Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala 
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AAT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

CCA CAC AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Pro His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 



TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn 
145 150 155 160 

Pro His Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1953

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAA GAC CCA TGG AAG GCT TTT ATG GCA CAA 336
Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln
100 105 110

GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT AAA 384
Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys
115 120 125

GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT GTT 432
Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val
130 135 140

AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT AAA 480
Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser Lys
145 150 155 160



AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT CAT 528
Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His
165 170 175



TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG CTG 576
Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu
180 185 190



TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA TTA 624
Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu
195 200 205



AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA GAT 672
Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp
210 215 220



GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC ACT 720 
Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr
225 230 235 240 



GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA GGT 768
Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly
245 250 255

TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA ATG 816
Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT ATT 864 
Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile
275 280 285 

CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT TTT 912 
Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe
290 295 300 

ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA ACT 960 
Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr
305 310 315 320 

TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT TAT 1008 
Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp Tyr
325 330 335 

TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT GGG 1056 
Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe Gly 
340 345 350 

AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA CCT 1104 
Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT AAA 1152 
Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT TAT 1200
Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr
385 390 395 400 

CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG GTA 1248 
Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA AAA 1296 
Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin Lys 
420 425 430 

AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC CAT 1344 
Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA GAT 1392 
Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr Asp 
450 455 460 



GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA TGT 1440
Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys
465 470 475 480

TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG ACA 1488
Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr
485 490 495 

CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT ACT 1536 
His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr
500 505 510 

CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC ATT 1584 
Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser He 
515 520 525 

ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA GAA 1632 
He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540 

TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA GCC 1680 
Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT AAC 1728 
Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC TAC 1776 
Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He Tyr 
580 585 590 

ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA TTT 1824 
Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr Phe
595 600 605 

GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG AAT 1872 
Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC TAT 1920 
Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He Tyr 
625 630 635 640 

ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1956 
He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln
100 105 110

Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys
115 120 125

Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr Val
130 135 140

Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser Lys
145 150 155 160

Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His
165 170 175

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu
180 185 190

Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu
195 200 205

Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp
210 215 220

Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr
225 230 235 240

Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly
245 250 255

Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met
260 265 270

Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile
275 280 285

Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe
290 295 300

Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr
305 310 315 320 

Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe Gly 
340 345 350 

Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val Tyr 
385 390 395 400 

Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin Lys 
420 425 430 

Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr Asp 
450 455 460 

Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp Thr 
485 490 495 

His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He Thr 
500 505 510 

Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser He 
515 520 525 

He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540 

Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He Tyr 
580 585 590 

He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr Phe 
595 600 605

Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn
610 615 620 

Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He Tyr 
625 630 635 640 

He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC GGA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Gly Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Gly Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gin Asp Ser lie Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr lie Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie 
580 585 590 

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1953



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30



CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAA GAC CCA TGG AAG GCT TTT ATG GCA CAA 336
Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln
100 105 110

GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT AAA 384
Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys
115 120 125



GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT GTT 432 
Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AAT CCA 480
Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn Pro
145 150 155 160



CAC AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT CAT 528 
His Ser Gin Gly Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser His 
165 170 175 

TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG CTG 576 
Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 



TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA TTA 624
Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu
195 200 205

AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA GAT 672 
Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220

GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC ACT 720 
Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr Thr 
225 230 235 240 

GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA GGT 768 
Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA ATG 816 
Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT ATT 864 
Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp lie 
275 280 285 

CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT TTT 912 
Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie Phe 
290 295 300 

ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA ACT 960 
Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro Thr 
305 310 315 320 

TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT TAT 1008 
Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT GGG 1056 
Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe Gly 
340 345 350 

AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA CCT 1104 
Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT AAA 1152 
Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT TAT 1200
Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr
385 390 395 400

CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG GTA 1248
Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val
405 410 415

TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA AAA 1296 
Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys
420 425 430

AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC CAT 1344 
Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA GAT 1392 
Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr Asp 
450 455 460 

GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA TGT 1440
Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys
465 470 475 480

TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG ACA 1488
Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp Thr
485 490 495

CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT ACT 1536 
His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He Thr 
500 505 510 

CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC ATT 1584
Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile
515 520 525

ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA GAA 1632
Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu
530 535 540

TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA GCC 1680 
Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT AAC 1728
Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr Asn
565 570 575

TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC TAC 1776 
Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He Tyr 
580 585 590 

ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA TTT 1824 
He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr Phe 
595 600 605 

GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG AAT 1872 
Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC TAT 1920 
Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He Tyr 
625 630 635 640 

ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1956 
Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln
100 105 110

Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Asn Pro 
145 150 155 160 

His Ser Gin Gly Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser His 
165 170 175 

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr Thr
225 230 235 240

Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

Thr Leu Thr Val Leu Asp Leu lie Val Leu Phe Pro Phe Tyr Asp He 
275 280 285 

Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He Phe 
290 295 300 

Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro Thr 
305 310 315 320 

Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe Gly 
340 345 350 

Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp Lys 
370 375 380 

Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr
385 390 395 400 

Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin Lys 
420 425 430 

Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp
450 455 460 

Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu Cys 
465 470 475 480 

Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp Thr 
485 490 495 

His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He Thr 
500 505 510 

Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser He 
515 520 525

Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu
530 535 540 

Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile Tyr
580 585 590 

lie Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr Phe 
595 600 605 

Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

Glu Leu lie lie Gly Ala Glu Ser Phe Val Ser Asn Glu Lys lie Tyr 
625 630 635 640 

lie Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1953



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 



GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAA GAC CCA TGG AAG GCT TTT ATG GCA CAA 336 
Asn Thr He Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gin 
100 105 110 

GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT AAA 384 
Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser Lys 
115 120 125 

GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT GTT 432 
Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

AAT GCG TTA AAT TCC TGG AAG AAA TTT CAC CAT TCT CGT CGT TCT AAA 480 
Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser Lys 
145 150 155 160 

AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT CAT 528 
Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser His 
165 170 175 

TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG CTG 576 
Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA TTA 624 
Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA GAT 672 
Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC ACT 720 
Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr Thr 
225 230 235 240 

GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA GGT 768
Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly
245 250 255

TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA ATG 816 
Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 



ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT ATT 864
Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile
275 280 285

CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT TTT 912
Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile Phe
290 295 300 

ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA ACT 960 
Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro Thr 
305 310 315 320 

TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT TAT 1008 
Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT GGG 1056 
Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe Gly 
340 345 350 



AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA CCT 1104 
Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 



AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT AAA 1152
Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys
370 375 380

TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT TAT 1200
Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr
385 390 395 400

CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG GTA 1248
Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val
405 410 415

TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA AAA 1296
Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln Lys
420 425 430



AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC CAT 1344 
Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA GAT 1392
Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp
450 455 460



GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA TGT 1440 
Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu Cys 
465 470 475 480 



TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG ACA 1488 
Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp Thr 
485 490 495 



CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT ACT 1536
His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile Thr
500 505 510

CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC ATT 1584
Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser Ile
515 520 525 

ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA GAA 1632 
Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu
530 535 540 

TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA GCC 1680 
Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala
545 550 555 560 

TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT AAC 1728 
Leu Leu Gin Arg Tyr Arg Val Arg lie Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC TAC 1776 
Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val lie Tyr 
580 585 590 

ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA TTT 1824 
He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr Phe 
595 600 605 

GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG AAT 1872 
Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC TAT 1920 
Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He Tyr 
625 630 635 640 

ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1956 
He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 651 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 



Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin lie Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

Asn Thr Ile Trp Pro Ser Glu Asp Pro Trp Lys Ala Phe Met Ala Gln
100 105 110

Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser Lys
115 120 125

Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr Val 
130 135 140 

Asn Ala Leu Asn Ser Trp Lys Lys Phe His His Ser Arg Arg Ser Lys 
145 150 155 160 

Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His
165 170 175 

Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val Leu 
180 185 190 

Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu Leu 
195 200 205 

Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu Asp 
210 215 220 

Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr Thr 
225 230 235 240 

Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg Gly 
245 250 255 

Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu Met 
260 265 270 

Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp Ile
275 280 285 

Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He Phe 
290 295 300 



Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro Thr
305 310 315 320

Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp Tyr 
325 330 335 

Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe Gly
340 345 350

Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg Pro 
355 360 365 

Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp Lys
370 375 380

Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val Tyr
385 390 395 400

Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys Val 
405 410 415 

Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin Lys 
420 425 430 

Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly His 
435 440 445 

Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr Asp
450 455 460 



Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu Cys
465 470 475 480

Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp Thr 
485 490 495 

His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He Thr 
500 505 510 

Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser He 
515 520 525 

He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys Glu 
530 535 540

Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala Ala 
545 550 555 560 

Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr Asn 
565 570 575 

Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He Tyr 
580 585 590 

He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr Phe 
595 600 605 

Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys Asn 
610 615 620 

Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He Tyr 
625 630 635 640

Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
lie Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp lie 
290 295 300 

TTT ACG GAT CCA ATT TTT ACC CTT AAT ACA CTA CAG AAG TAC GGA CCA 960 
Phe Thr Asp Pro lie Phe Thr Leu Asn Thr Leu Gin Lys Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser lie Glu Asn Ser lie Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gin Gly lie Glu Phe His Thr Arg Leu Gin Pro Gly Tyr Phe 
340 345 350 



GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 



AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Thr Leu Asn Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TCT ACG GAT CCA ATT TTT GCC GTT AAT ACT CTG TGG GAA TAC GGA CCA 960 
Ser Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CGA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GCA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Ala Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 



CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 62: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 652 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255



Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Ser Thr Asp Pro Ile Phe Ala Val Asn Thr Leu Trp Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Ala Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48 
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15 

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336 
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

GTT CGG TTA TAC CCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CGA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 



ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959 
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Pro Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480



Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 65: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30 

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144 
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110 

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480 
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

AAA AGA AGC CAA GGT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528 
Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175 

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816 
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864 
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300 

TTT ACG GAT CCA ATT TTT TTA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380 



AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480 

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488 
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190



Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gin Asp Arg Arg Gly Thr He Pro Phe Phe Thr Trp 
485 490 495 

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96 
Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 
CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45 

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192 
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240 
Ala Val Gly Thr Gly lie Ser Val Val Gly Gin lie Leu Gly Val Val 
65 70 75 80 

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288 
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gin Ser Phe Leu 
85 90 95 

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384 
Gin Val Glu Val Leu lie Asp Lys Lys lie Glu Glu Tyr Ala Lys Ser 
115 120 125 

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432 
Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576 
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624 
Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672 
Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720 
Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768 
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 
GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285 

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912 
He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960 
Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly Pro 
305 310 315 320 

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008 
Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CGA CCT GGT TAC TTT 1056 
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe
340 345 350 

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104 
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152 
Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200 
Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248 
Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys 
405 410 415 

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296 
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344 
Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392 
His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440 
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 
TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495 

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536 
Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys lie 
500 505 510 

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584 
Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632 
He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680 
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560 

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728 
Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776 
Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824 
Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys 
610 615 620 

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920 
Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys He 
625 630 635 640 

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650
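
As a consistency check on the figures stated for SEQ ID NO: 67, the following Python sketch works through the arithmetic: a 1959-base-pair sequence whose CDS spans positions 1..1956 should encode 652 residues, the length stated for SEQ ID NO: 68, followed by a single stop codon. The function name `cds_summary` is hypothetical and is used here for illustration only.

```python
def cds_summary(total_bp: int, cds_start: int, cds_end: int) -> dict:
    """Codon arithmetic for a CDS given 1-based, inclusive coordinates."""
    cds_len = cds_end - cds_start + 1
    if cds_len % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return {
        "cds_bp": cds_len,                   # bases inside the CDS
        "residues": cds_len // 3,            # one residue per codon
        "trailing_bp": total_bp - cds_end,   # bases after the CDS (stop codon)
    }


# Figures taken from the SEQ ID NO: 67 header.
summary = cds_summary(total_bp=1959, cds_start=1, cds_end=1956)
print(summary)
```

Applying the same function to the figures given for SEQ ID NO: 69 (1482 base pairs, CDS 1..1479) yields 493 residues, in agreement with the length stated for SEQ ID NO: 70.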



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gin Thr Asn His Asn Gin Tyr Pro Leu Ala Asp Asn 
20 25 30 

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met 
35 40 45 

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp 
50 55 60 

Ala Val Gly Thr Gly He Ser Val Val Gly Gin He Leu Gly Val Val 
65 70 75 80 

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gin Val Glu Val Leu He Asp Lys Lys He Glu Glu Tyr Ala Lys Ser 
115 120 125 

Lys Ala Leu Ala Glu Leu Gin Gly Leu Gin Asn Asn Phe Glu Asp Tyr 
130 135 140 

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser 
145 150 155 160 

Lys Arg Ser Gin Asp Arg He Arg Glu Leu Phe Ser Gin Ala Glu Ser 
165 170 175 

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val 
180 185 190 

Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu Leu 
195 200 205 

Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu 
210 215 220 

Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin Tyr 
225 230 235 240 

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg 
245 250 255 

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu 
260 265 270 

Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr Asp 
275 280 285 

He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp He 
290 295 300 
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe Asp 
325 330 335 

Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Arg Pro Gly Tyr Phe 
340 345 350 

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg 
355 360 365 

Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly Asp 
370 375 380 

Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys Val 
385 390 395 400 

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415 

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp Gin 
420 425 430 

Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn Gly 
435 440 445 

His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr Thr 
450 455 460 

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala Glu 
465 470 475 480 

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr He Asp Ala Glu Lys He 
500 505 510 

Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser 
515 520 525 

He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys 
530 535 540 

Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser Ala 
545 550 555 560 

Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr Thr 
565 570 575 

Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val He 
580 585 590 

Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin Thr 
595 600 605 
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
645 650 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1482 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1479



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

AGT AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA 48
Ser Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu
1 5 10 15

AGT CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA 96 
Ser His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu 
20 25 30 

GTG CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG 144 
Val Leu Phe Leu Pro Thr Tyr Ala Gin Ala Ala Asn Thr His Leu Leu 
35 40 45 

CTA TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA 192 
Leu Leu Lys Asp Ala Gin Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser 
50 55 60 

GAA GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA 240 
Glu Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin 
65 70 75 80 

TAC ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA 288
Tyr Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu
85 90 95

AGA GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA 336
Arg Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg
100 105 110

GAA ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT 384
Glu Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr
115 120 125

GAT ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC 432
Asp Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp
130 135 140

ATT TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA 480
Ile Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly
145 150 155 160 

CCA ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT 528 
Pro Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe 
165 170 175 

GAT TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC 576 
Asp Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr 
180 185 190 

TTT GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT 624 
Phe Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr 
195 200 205 

AGA CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA 672 
Arg Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly 
210 215 220 

GAT AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA 720 
Asp Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys 
225 230 235 240 

GTT TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT 768 
Val Tyr Arg Thr He Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly 
245 250 255 

AAG GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT 816 
Lys Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp 
260 265 270

CAA AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT 864 
Gin Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn 
275 280 285 

GGC CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA 912 
Gly His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr 
290 295 300 

ACA GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG 960 
Thr Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala 
305 310 315 320 

GAA TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT 1008
Glu Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr
325 330 335

TGG ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG 1056
Trp Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys
340 345 350 

ATT ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT 1104 
He Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala 
355 360 365 

TCC ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA 1152 
Ser He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu 
370 375 380 

AAA GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA 1200 
Lys Glu Ser Ser Asn Ser lie Ala Lys Phe Lys Val Thr Leu Asn Ser 
385 390 395 400 

GCA GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC 1248 
Ala Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr 
405 410 415 

ACT AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC 1296 
Thr Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val 
420 425 430 

ATC TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA 1344 
He Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin 
435 440 445 

ACA TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT 1392
Thr Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp
450 455 460

AAG AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA 1440 
Lys Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys 
465 470 475 480 

ATC TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1482 
He Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
485 490 
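
The pairing of codons and residues shown for SEQ ID NO: 69 can be reproduced with the standard genetic code. The sketch below is illustrative only: the names `CODON_TABLE` and `translate` are hypothetical, and the table is the standard nuclear code, which is assumed here rather than stated in the listing. It translates the first and last rows of the sequence as printed above; in the first row the eighth codon, ATA, encodes isoleucine (Ile).

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter residues."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First and last rows of SEQ ID NO: 69 as printed in the listing.
first_row = "AGT AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA"
last_row = "ATC TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA"

print(translate(first_row))  # SKRSQDRIRELFSQAE
print(translate(last_row))   # IYIDKIEFIPVQL* (the asterisk marks the stop codon)
```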



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 493 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Ser Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu
1 5 10 15

Ser His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu
20 25 30

Val Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu
35 40 45

Leu Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser
50 55 60

Glu Asp Val Ala Glu Phe Tyr His Arg Gin Leu Lys Leu Thr Gin Gin 
65 70 75 80 

Tyr Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu 
85 90 95 

Arg Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg
100 105 110

Glu Met Thr Leu Thr Val Leu Asp Leu He Val Leu Phe Pro Phe Tyr 
115 120 125 

Asp He Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp 
130 135 140 

He Phe Thr Asp Pro He Phe Ser Leu Asn Thr Leu Gin Glu Tyr Gly 
145 150 155 160 

Pro Thr Phe Leu Ser He Glu Asn Ser He Arg Lys Pro His Leu Phe 
165 170 175 

Asp Tyr Leu Gin Gly He Glu Phe His Thr Arg Leu Gin Pro Gly Tyr 
180 185 190 

Phe Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr 
195 200 205 

Arg Pro Ser He Gly Ser Ser Lys Thr He Thr Ser Pro Phe Tyr Gly 
210 215 220 

Asp Lys Ser Thr Glu Pro Val Gin Lys Leu Ser Phe Asp Gly Gin Lys 
225 230 235 240 

Val Tyr Arg Thr lie Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly 
245 250 255 

Lys Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gin Tyr Asp Asp 
260 265 270

Gin Lys Asn Glu Thr Ser Thr Gin Thr Tyr Asp Ser Lys Arg Asn Asn 
275 280 285 

Gly His Val Ser Ala Gin Asp Ser He Asp Gin Leu Pro Pro Glu Thr 
290 295 300 

Thr Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gin Leu Asn Tyr Ala 
305 310 315 320 

Glu Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr
325 330 335

Trp Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys
340 345 350

He Thr Gin Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala 
355 360 365 

Ser He He Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu 
370 375 380 

Lys Glu Ser Ser Asn Ser He Ala Lys Phe Lys Val Thr Leu Asn Ser 
385 390 395 400 

Ala Ala Leu Leu Gin Arg Tyr Arg Val Arg He Arg Tyr Ala Ser Thr 
405 410 415 

Thr Asn Leu Arg Leu Phe Val Gin Asn Ser Asn Asn Asp Phe Leu Val 
420 425 430 

He Tyr He Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gin 
435 440 445 

Thr Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp 
450 455 460 

Lys Asn Glu Leu He He Gly Ala Glu Ser Phe Val Ser Asn Glu Lys 
465 470 475 480 

He Tyr He Asp Lys He Glu Phe He Pro Val Gin Leu 
485 490 

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
AGACAACTCT ACAGTAAAAG ATG 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
GGTAATTGGT CAATAGAATC 20
(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 21..23

(D) OTHER INFORMATION: /note= "N = A, T, G or C"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
CAGAAGATGT TGCTGAATTC NNNCATAGAC AATTAAAAC 39
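
Because SEQ ID NO: 73 carries three fully degenerate positions, it describes a pool of oligonucleotides rather than a single molecule. The following Python sketch, offered only as an illustration, confirms that the N positions fall at 21..23 as stated in the feature table and counts the distinct sequences in the pool. The helper name `expand` is hypothetical.

```python
from itertools import product

# SEQ ID NO: 73 as printed, with the block spacing removed.
primer = "CAGAAGATGT TGCTGAATTC NNNCATAGAC AATTAAAAC".replace(" ", "")

# 1-based positions of the degenerate bases.
n_positions = [i + 1 for i, base in enumerate(primer) if base == "N"]


def expand(seq: str) -> list:
    """Enumerate every concrete sequence, taking N = A, T, G or C."""
    slots = ["ATGC" if base == "N" else base for base in seq]
    return ["".join(choice) for choice in product(*slots)]


variants = expand(primer)
print(len(primer), n_positions, len(variants))  # 39 [21, 22, 23] 64
```

The 64 variants correspond to the 4 x 4 x 4 possible base triplets at positions 21..23.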

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 19..21

(D) OTHER INFORMATION: /note= "N = A, T, G or C" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
GATGTTGCTG AATTCTATNN NAGACAATTA AAAC 34 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 17

(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 18

(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 19

(D) OTHER INFORMATION: /note= "N = A, T, G or C"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
CCCATTTTAT GATATTNNNT TATACTCAAA AGG
(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 24

(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (25, 27, 28, 30, 34, 36, 39, 43)
(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (31, 33, 35, 37, 42, 44)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 40

(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (26, 29, 32, 38, 41)

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
AGCTATGCTG GTCTCGGAAG AAANNNNNNN NNNNNNNNNN NNNNAAAAGA AGCCAAGATC 60
GAAT 64
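
Where the degenerate positions are distributed over several FEATURE entries, as in SEQ ID NO: 76, it is useful to confirm that the listed locations account for every N in the sequence exactly once. The Python sketch below is an illustrative check under that assumption; the variable names are hypothetical, and the position lists are transcribed from the feature table above.

```python
# SEQ ID NO: 76 as printed, with the block spacing removed.
primer = (
    "AGCTATGCTG GTCTCGGAAG AAANNNNNNN NNNNNNNNNN "
    "NNNNAAAAGA AGCCAAGATC GAAT"
).replace(" ", "")

# 1-based locations from the five modified_base FEATURE entries.
feature_positions = [
    [24],
    [25, 27, 28, 30, 34, 36, 39, 43],
    [31, 33, 35, 37, 42, 44],
    [40],
    [26, 29, 32, 38, 41],
]

listed = sorted(p for group in feature_positions for p in group)
actual = [i + 1 for i, base in enumerate(primer) if base == "N"]

covers_all = listed == actual                  # every N is annotated
no_overlap = len(listed) == len(set(listed))   # no position listed twice
print(len(primer), covers_all, no_overlap)     # 64 True True
```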
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
GGTCACCTAG GTCTCTCTTC CAGGAATTTA ACGCATTAAC 40 
(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 65 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (22, 27, 29, 30, 37, 42)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (23, 26, 28, 31, 38, 40, 43, 44)
(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (24, 39)

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (25, 32, 33, 41, 46, 47, 48)
(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 34

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 45

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 35..36

(D) OTHER INFORMATION: /note= "N = A, G, C or T"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
AGCTATGCTG GTCTCCCATT TNNNNNNNNN NNNNNNNNNN NNNNNNNNGT TAAAACAGAA 60 
CTAAC 65 
(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
ATCCAGTGGG GTCTCAAATG GGAAAAGTAC AATTAG 36 
(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (23, 27, 31, 36, 44)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (24, 25, 26, 33, 35, 38)

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (28, 34, 37)

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (29, 30, 32, 39, 42, 45)

(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (40, 43)

(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 41

(D) OTHER INFORMATION: /note= "N = A, C, T or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 46

(D) OTHER INFORMATION: /note= "N = A, T, G or C"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
CATTTTTACG GATCCAATTT TTNNNNNNNN NNNNNNNNNN NNNNNNGGAC CAACTTTTTT 60
GAG 63

(2) INFORMATION FOR SEQ ID NO : 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (28, 31, 32, 33, 42)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (29, 38, 39, 41)

(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 30

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (34, 35, 40)

(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 36

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 37

(D) OTHER INFORMATION: /note= "N = A, T, G or C"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GAATTTCATA CGCGTCTTCA ACCTGGTNNN NNNNNNNNNN NNTCTTTCAA TTATTGGTCT 60
GG

(2) INFORMATION FOR SEQ ID NO : 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 73 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (41, 49, 52)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 42..43

(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 44..45

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 46

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (47, 48, 53, 54)
(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (50, 51, 55)

(D) OTHER INFORMATION: /note= "N = A, T, C or G"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
AAAAGTTTAT CGAACTATAG CTAATACAGA CGTAGCGGCT NNNNNNNNNN NNNNNGTATA 60
TTTAGGTGTT ACG 73

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
GGAGTTCCAT TTGCTGGGGC 20
(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
ATCTCCATAA AATGGGG 17
(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
GCGAAGTAAA AGAAGCCAAG GTCGAATAAG GG
(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
CCTTTAAGTT TGCGAAATCC ACACAGCCAA GGTCGAATAA GGG 
(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
CCCATTTTAT GATGTTCGGT TATACCCAAA AGGGG 
(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 
GGCCAAGTGA AGACCCATGG AAGGC 
(2) INFORMATION FOR SEQ ID NO: 89: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
GCAGTTTCCG GATTCGAAGT GC 
(2) INFORMATION FOR SEQ ID NO : 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
CCGCTACGTC TGTATTA 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 
ATAATGGAAG CACCTGA 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (22, 26, 29)

(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (23, 33, 36)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (24, 27, 28, 32, 35, 37, 38)
(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (25, 30, 31, 34)

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 39

(D) OTHER INFORMATION: /note= "N = A, T, G or C"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 
AGCTATGCTG GTCTCTTCTT ANNNNNNNNN NNNNNNNNNA CAATTCCATT TTTTACTTGG 



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 
ATCCAGTTGG GTCTCTAAGA AACAAACCGC GTAATTAAGC 
(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
CCTCAAGGGT TATAACATCC 
(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (19, 22, 23, 31)

(D) OTHER INFORMATION: /note= "N = A, T, C or G"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (20, 26, 27, 29, 30, 35)

(D) OTHER INFORMATION: /note= "N = T, G, C or A"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (21, 32, 34)

(D) OTHER INFORMATION: /note= "N = A, G, C or T"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: one-of (24, 33)

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 25

(D) OTHER INFORMATION: /note= "N = A, G, T or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 28

(D) OTHER INFORMATION: /note= "N = A, T, G or C"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 36

(D) OTHER INFORMATION: /note= "N = A, G, C or T"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 
GTACAAAAGC TAAGCTTTNN NNNNNNNNNN NNNNNNCGAA CTATAGCTAA TACAG

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Ser Lys Arg Ser Gln Asp Arg
1 5

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..1956



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

ATG AAT CCA AAC AAT CGA AGT GAA CAT GAT ACG ATA AAG GTT ACA CCT 48
Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

AAC AGT GAA TTG CAA ACT AAC CAT AAT CAA TAT CCT TTA GCT GAC AAT 96
Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

CCA AAT TCA ACA CTA GAA GAA TTA AAT TAT AAA GAA TTT TTA AGA ATG 144
Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

ACT GAA GAC AGT TCT ACG GAA GTG CTA GAC AAC TCT ACA GTA AAA GAT 192
Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

GCA GTT GGG ACA GGA ATT TCT GTT GTA GGG CAG ATT TTA GGT GTT GTA 240
Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

GGA GTT CCA TTT GCT GGG GCA CTC ACT TCA TTT TAT CAA TCA TTT CTT 288
Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

AAC ACT ATA TGG CCA AGT GAT GCT GAC CCA TGG AAG GCT TTT ATG GCA 336
Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

CAA GTT GAA GTA CTG ATA GAT AAG AAA ATA GAG GAG TAT GCT AAA AGT 384
Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

AAA GCT CTT GCA GAG TTA CAG GGT CTT CAA AAT AAT TTC GAA GAT TAT 432
Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

GTT AAT GCG TTA AAT TCC TGG AAG AAA ACA CCT TTA AGT TTG CGA AGT 480
Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

AAA AGA AGC CAA GAT CGA ATA AGG GAA CTT TTT TCT CAA GCA GAA AGT 528
Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

CAT TTT CGT AAT TCC ATG CCG TCA TTT GCA GTT TCC AAA TTC GAA GTG 576
His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

CTG TTT CTA CCA ACA TAT GCA CAA GCT GCA AAT ACA CAT TTA TTG CTA 624
Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

TTA AAA GAT GCT CAA GTT TTT GGA GAA GAA TGG GGA TAT TCT TCA GAA 672
Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

GAT GTT GCT GAA TTT TAT CAT AGA CAA TTA AAA CTT ACA CAA CAA TAC 720
Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

ACT GAC CAT TGT GTT AAT TGG TAT AAT GTT GGA TTA AAT GGT TTA AGA 768
Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

GGT TCA ACT TAT GAT GCA TGG GTC AAA TTT AAC CGT TTT CGC AGA GAA 816
Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

ATG ACT TTA ACT GTA TTA GAT CTA ATT GTA CTT TTC CCA TTT TAT GAT 864
Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

ATT CGG TTA TAC TCA AAA GGG GTT AAA ACA GAA CTA ACA AGA GAC ATT 912
Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

TTT ACG GAT CCA ATT TTT TCA CTT AAT ACT CTT CAG GAG TAT GGA CCA 960
Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

ACT TTT TTG AGT ATA GAA AAC TCT ATT CGA AAA CCT CAT TTA TTT GAT 1008
Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

TAT TTA CAG GGG ATT GAA TTT CAT ACG CGT CTT CAA CCT GGT TAC TTT 1056
Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

GGG AAA GAT TCT TTC AAT TAT TGG TCT GGT AAT TAT GTA GAA ACT AGA 1104
Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

CCT AGT ATA GGA TCT AGT AAG ACA ATT ACT TCC CCA TTT TAT GGA GAT 1152
Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

AAA TCT ACT GAA CCT GTA CAA AAG CTA AGC TTT GAT GGA CAA AAA GTT 1200
Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

TAT CGA ACT ATA GCT AAT ACA GAC GTA GCG GCT TGG CCG AAT GGT AAG 1248
Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

GTA TAT TTA GGT GTT ACG AAA GTT GAT TTT AGT CAA TAT GAT GAT CAA 1296
Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

AAA AAT GAA ACT AGT ACA CAA ACA TAT GAT TCA AAA AGA AAC AAT GGC 1344
Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

CAT GTA AGT GCA CAG GAT TCT ATT GAC CAA TTA CCG CCA GAA ACA ACA 1392
His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

GAT GAA CCA CTT GAA AAA GCA TAT AGT CAT CAG CTT AAT TAC GCG GAA 1440
Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

TGT TTC TTA ATG CAG GAC CGT CGT GGA ACA ATT CCA TTT TTT ACT TGG 1488
Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

ACA CAT AGA AGT GTA GAC TTT TTT AAT ACA ATT GAT GCT GAA AAG ATT 1536
Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

ACT CAA CTT CCA GTA GTG AAA GCA TAT GCC TTG TCT TCA GGT GCT TCC 1584
Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

ATT ATT GAA GGT CCA GGA TTC ACA GGA GGA AAT TTA CTA TTC CTA AAA 1632
Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

GAA TCT AGT AAT TCA ATT GCT AAA TTT AAA GTT ACA TTA AAT TCA GCA 1680
Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

GCC TTG TTA CAA CGA TAT CGT GTA AGA ATA CGC TAT GCT TCT ACC ACT 1728
Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

AAC TTA CGA CTT TTT GTG CAA AAT TCA AAC AAT GAT TTT CTT GTC ATC 1776
Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

TAC ATT AAT AAA ACT ATG AAT AAA GAT GAT GAT TTA ACA TAT CAA ACA 1824
Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

TTT GAT CTC GCA ACT ACT AAT TCT AAT ATG GGG TTC TCG GGT GAT AAG 1872
Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

AAT GAA CTT ATA ATA GGA GCA GAA TCT TTC GTT TCT AAT GAA AAA ATC 1920
Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

TAT ATA GAT AAG ATA GAA TTT ATC CCA GTA CAA TTG TAA 1959
Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2000 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

CCATCCATGG CAAACCCTAA CAATCGTTCC GAACACGACA CCATCAAGGT TACTCCAAAC 60

TCTGAGTTGC AAACTAATCA CAACCAGTAC CCATTGGCTG ACAATCCTAA CAGTACTCTT 120 

GAGGAACTTA ACTACAAGGA GTTTCTCCGG ATGACCGAAG ATAGCTCCAC TGAGGTTCTC 180 

GATAACTCTA CAGTGAAGGA CGCTGTTGGA ACTGGCATTA GCGTTGTGGG ACAGATTCTT 240

GGAGTGGTTG GTGTTCCATT CGCTGGAGCT TTGACCAGCT TCTACCAGTC CTTTCTCAAC 300

ACCATCTGGC CTTCAGATGC TGATCCCTGG AAGGCTTTCA TGGCCCAAGT GGAAGTCTTG 360 

ATCGATAAGA AGATCGAAGA GTATGCCAAG TCTAAAGCCT TGGCTGAGTT GCAAGGTTTG 420

CAGAACAACT TCGAGGATTA CGTCAACGCA CTCAACAGCT GGAAGAAAAC TCCCTTGAGT 480 

CTCAGGTCTA AGCGTTCCCA GGACCGTATT CGTGAACTTT TCAGCCAAGC CGAATCCCAC 540 

TTCAGAAACT CCATGCCTAG CTTTGCCGTT TCTAAGTTCG AGGTGCTCTT CTTGCCAACA 600

TACGCACAAG CTGCCAACAC TCATCTCTTG CTTCTCAAAG ACGCTCAGGT GTTTGGTGAG 660 

GAATGGGGTT ACTCCAGTGA AGATGTTGCC GAGTTCTACC GTAGGCAGCT CAAGTTGACT 720 

CAACAGTACA CAGACCACTG CGTCAACTGG TACAACGTTG GGCTCAATGG TCTTAGAGGA 780 

TCTACCTACG ACGCATGGGT GAAGTTCAAC AGGTTTCGTA GAGAGATGAC CTTGACTGTG 840 

CTCGATCTTA TCGTTCTCTT TCCATTCTAC GACATTCGTC TTTACTCCAA AGGCGTTAAG 900 

ACAGAGCTGA CCAGAGACAT CTTCACCGAT CCCATCTTCC TACTTACGAC CCTGCAGAAA 960

TACGGTCCAA CTTTTCTCTC CATTGAGAAC AGCATCAGGA AGCCTCACCT CTTCGACTAT 1020 

CTGCAAGGCA TTGAGTTTCA CACCAGGTTG CAACCTGGTT ACTTCGGTAA GGATTCCTTC 1080 

AACTACTGGA GCGGAAACTA CGTTGAAACC AGACCATCCA TCGGATCTAG CAAGACCATC 1140 

ACTTCTCCAT TCTACGGTGA CAAGAGCACT GAGCCAGTGC AGAAGTTGAG CTTCGATGGG 1200 

CAGAAGGTGT ATAGAACCAT CGCCAATACC GATGTTGCAG CTTGGCCTAA TGGCAAGGTC 1260 

TACCTTGGAG TTACTAAAGT GGACTTCTCC CAATACGACG ATCAGAAGAA CGAGACATCT 1320 

ACTCAAACCT ACGATAGTAA GAGGAACAAT GGCCATGTTT CCGCACAAGA CTCCATTGAC 1380 
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CAACTTCCAC CTGAAACCAC TGATGAACCA TTGGAGAAGG CTTACAGTCA CCAACTTAAC 1440 

TACGCCGAAT GCTTTCTCAT GCAAGACAGG CGTGGCACCA TTCCGTTCTT TACATGGACT 1500 

CACAGGTCTG TCGACTTCTT TAACACTATC GACGCTGAGA AGATTACCCA ACTTCCCGTG 1560 

GTCAAGGCTT ATGCCTTGTC CAGCGGAGCT TCCATCATTG AAGGTCCAGG CTTCACCGGT 1620 

GGCAACTTGC TCTTCCTTAA GGAGTCCAGC AACTCCATCG CCAAGTTCAA AGTGACACTT 1680 

AACTCAGCAG CCTTGCTCCA ACGTTACAGG GTTCGTATCA GATACGCAAG CACTACCAAT 1740

CTTCGCCTCT TTGTCCAGAA CAGCAACAAT GATTTCCTTG TCATCTACAT CAACAAGACT 1800 

ATGAACAAAG ACGATGACCT CACCTACCAA ACATTCGATC TTGCCACTAC CAATAGTAAC 1860

ATGGGATTCT CTGGTGACAA GAACGAGCTG ATCATAGGTG CTGAGAGCTT TGTCTCTAAT 1920 

GAGAAGATTT ACATAGACAA GATCGAGTTC ATTCCAGTTC AACTCTAATA GATCCCCCGG 1980 

GCTGCAGGAA TTCGATATCA 2000 
(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 653 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Met Ala Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr
1 5 10 15

Pro Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp
20 25 30

Asn Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg
35 40 45

Met Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys
50 55 60

Asp Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val
65 70 75 80

Val Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe
85 90 95

Leu Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met
100 105 110

Ala Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys
115 120 125

Ser Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp
130 135 140

Tyr Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg
145 150 155 160

Ser Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu
165 170 175

Ser His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu
180 185 190

Val Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu
195 200 205

Leu Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser
210 215 220

Glu Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln
225 230 235 240

Tyr Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu
245 250 255

Arg Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg
260 265 270

Glu Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr
275 280 285

Asp Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp
290 295 300

Ile Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly
305 310 315 320

Pro Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe
325 330 335

Asp Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr
340 345 350

Phe Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr
355 360 365

Arg Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly
370 375 380

Asp Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys
385 390 395 400

Val Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly
405 410 415

Lys Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp
420 425 430

Gln Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn
435 440 445

Gly His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr
450 455 460

Thr Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala
465 470 475 480

Glu Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr
485 490 495

Trp Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys
500 505 510

Ile Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala
515 520 525

Ser Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu
530 535 540

Lys Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser
545 550 555 560

Ala Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr
565 570 575

Thr Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val
580 585 590

Ile Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln
595 600 605

Thr Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp
610 615 620

Lys Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys
625 630 635 640

Ile Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650



(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2050 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

TGGAGCTCCA CCGCGGTGGC GGCCGCTCTA GAACTAGTGG ATCTAGGCCT CCATATGAAC 60 

CCTAACAATC GTTCCGAACA CGACACCATC AAGGTTACTC CAAACTCTGA GTTGCAAACT 120 

AATCACAACC AGTACCCATT GGCTGACAAT CCTAACAGTA CTCTTGAGGA ACTTAACTAC 180 

AAGGAGTTTC TCCGGATGAC CGAAGATAGC TCCACTGAGG TTCTCGATAA CTCTACAGTG 240 

AAGGACGCTG TTGGAACTGG CATTAGCGTT GTGGGACAGA TTCTTGGAGT GGTTGGTGTT 300 

CCATTCGCTG GAGCTTTGAC CAGCTTCTAC CAGTCCTTTC TCAACACCAT CTGGCCTTCA 360 

GATGCTGATC CCTGGAAGGC TTTCATGGCC CAAGTGGAAG TCTTGATCGA TAAGAAGATC 420 

GAAGAGTATG CCAAGTCTAA AGCCTTGGCT GAGTTGCAAG GTTTGCAGAA CAACTTCGAG 480 

GATTACGTCA ACGCACTCAA CAGCTGGAAG AAAACTCCCT TGAGTCTCAG GTCTAAGCGT 540 

TCCCAGGACC GTATTCGTGA ACTTTTCAGC CAAGCCGAAT CCCACTTCAG AAACTCCATG 600 

CCTAGCTTTG CCGTTTCTAA GTTCGAGGTG CTCTTCTTGC CAACATACGC ACAAGCTGCC 660 

AACACTCATC TCTTGCTTCT CAAAGACGCT CAGGTGTTTG GTGAGGAATG GGGTTACTCC 720 

AGTGAAGATG TTGCCGAGTT CTACCATAGG CAGCTCAAGT TGACTCAACA GTACACAGAC 780 

CACTGCGTCA ACTGGTACAA CGTTGGGCTC AATGGTCTTA GAGGATCTAC CTACGACGCA 840 

TGGGTGAAGT TCAACAGGTT TCGTAGAGAG ATGACCTTGA CTGTGCTCGA TCTTATCGTT 900 

CTCTTTCCAT TCTACGACAT TCGTCTTTAC TCCAAAGGCG TTAAGACAGA GCTGACCAGA 960 

GACATCTTCA CCGATCCCAT CTTCTCACTT AACACCCTGC AGGAATACGG TCCAACTTTT 1020 

CTCTCCATTG AGAACAGCAT CAGGAAGCCT CACCTCTTCG ACTATCTGCA AGGCATTGAG 1080 

TTTCACACCA GGTTGCAACC TGGTTACTTC GGTAAGGATT CCTTCAACTA CTGGAGCGGA 1140 

AACTACGTTG AAACCAGACC ATCCATCGGA TCTAGCAAGA CCATCACTTC TCCATTCTAC 1200 

GGTGACAAGA GCACTGAGCC AGTGCAGAAG TTGAGCTTCG ATGGGCAGAA GGTGTATAGA 1260 

ACCATCGCCA ATACCGATGT TGCAGCTTGG CCTAATGGCA AGGTCTACCT TGGAGTTACT 1320 

AAAGTGGACT TCTCCCAATA CGACGATCAG AAGAACGAGA CATCTACTCA AACCTACGAT 1380 

AGTAAGAGGA ACAATGGCCA TGTTTCCGCA CAAGACTCCA TTGACCAACT TCCACCTGAA 1440 

ACCACTGATG AACCATTGGA GAAGGCTTAC AGTCACCAAC TTAACTACGC CGAATGCTTT 1500

CTCATGCAAG ACAGGCGTGG CACCATTCCG TTCTTTACAT GGACTCACAG GTCTGTCGAC 1560

TTCTTTAACA CTATCGACGC TGAGAAGATT ACCCAACTTC CCGTGGTCAA GGCTTATGCC 1620 

TTGTCCAGCG GAGCTTCCAT CATTGAAGGT CCAGGCTTCA CCGGTGGCAA CTTGCTCTTC 1680

CTTAAGGAGT CCAGCAACTC CATCGCCAAG TTCAAAGTGA CACTTAACTC AGCAGCCTTG 1740 

CTCCAACGTT ACAGGGTTCG TATCAGATAC GCAAGCACTA CCAATCTTCG CCTCTTTGTC 1800 

CAGAACAGCA ACAATGATTT CCTTGTCATC TACATCAACA AGACTATGAA CAAAGACGAT 1860 

GACCTCACCT ACAACACATT CGATCTTGCC ACTACCAATA GTAACATGGG ATTCTCTGGT 1920 

GACAAGAACG AGCTGATCAT AGGTGCTGAG AGCTTTGTCT CTAATGAGAA GATTTACATA 1980 

GACAAGATCG AGTTCATTCC AGTTCAACTC TAATAGATCC CCCGGGCTGC AGGAATTCGA 2040 

TATCAAGCTT 2050 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2280 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

TTAAAATTAA TTTTGTATAC TTTTCATTGT AATAATATGA TTTTAAAAAC GAAAAAGTGC 60 

ATATACAACT TATCAGGAGG GGGGGGATGC ACAAAGAAGA AAAGAATAAG AAGTGAATGT 120 

TTATAATGTT CAATAGTTTT ATGGGAAGGC ATTTTATCAG GTAGAAAGTT ATGTATTATG 180 

ATAAGAATGG GAGGAAGAAA AATGAATCCA AACAATCGAA GTGAACATGA TACGATAAAG 240 

GTTACACCTA ACAGTGAATT GCAAACTAAC CATAATCAAT ATCCTTTAGC TGACAATCCA 300 

AATTCAACAC TAGAAGAATT AAATTATAAA GAATTTTTAA GAATGACTGA AGACAGTTCT 360

ACGGAAGTGC TAGACAACTC TACAGTAAAA GATGCAGTTG GGACAGGAAT TTCTGTTGTA 420

GGGCAGATTT TAGGTGTTGT AGGAGTTCCA TTTGCTGGGG CACTCACTTC ATTTTATCAA 480 

TCATTTCTTA ACACTATATG GCCAAGTGAT GCTGACCCAT GGAAGGCTTT TATGGCACAA 540 

GTTGAAGTAC TGATAGATAA GAAAATAGAG GAGTATGCTA AAAGTAAAGC TCTTGCAGAG 600 

TTACAGGGTC TTCAAAATAA TTTCGAAGAT TATGTTAATG CGTTAAATTC CTGGAAGAAA 660

ACACCTTTAA GTTTGCGAAG TAAAAGAAGC CAAGATCGAA TAAGGGAACT TTTTTCTCAA 720
GCAGAAAGTC ATTTTCGTAA TTCCATGCCG TCATTTGCAG TTTCCAAATT CGAAGTGCTG 780 
TTTCTACCAA CATATGCACA AGCTGCAAAT ACACATTTAT TGCTATTAAA AGATGCTCAA 840 
GTTTTTGGAG AAGAATGGGG ATATTCTTCA GAAGATGTTG CTGAATTTTA TCATAGACAA 900 
TTAAAACTTA CACAACAATA CACTGACCAT TGTGTTAATT GGTATAATGT TGGATTAAAT 960 

GGTTTAAGAG GTTCAACTTA TGATGCATGG GTCAAATTTA ACCGTTTTCG CAGAGAAATG 1020 

ACTTTAACTG TATTAGATCT AATTGTACTT TTCCCATTTT ATGATATTCG GTTATACTCA 1080 

AAAGGGGTTA AAACAGAACT AACAAGAGAC ATTTTTACGG ATCCAATTTT TTCACTTAAT 1140 

ACTCTTCAGG AGTATGGACC AACTTTTTTG AGTATAGAAA ACTCTATTCG AAAACCTCAT 1200 

TTATTTGATT ATTTACAGGG GATTGAATTT CATACGCGTC TTCAACCTGG TTACTTTGGG 1260

AAAGATTCTT TCAATTATTG GTCTGGTAAT TATGTAGAAA CTAGACCTAG TATAGGATCT 1320

AGTAAGACAA TTACTTCCCC ATTTTATGGA GATAAATCTA CTGAACCTGT ACAAAAGCTA 1380

AGCTTTGATG GACAAAAAGT TTATCGAACT ATAGCTAATA CAGACGTAGC GGCTTGGCCG 1440 

AATGGTAAGG TATATTTAGG TGTTACGAAA GTTGATTTTA GTCAATATGA TGATCAAAAA 1500 

AATGAAACTA GTACACAAAC ATATGATTCA AAAAGAAACA ATGGCCATGT AAGTGCACAG 1560 

GATTCTATTG ACCAATTACC GCCAGAAACA ACAGATGAAC CACTTGAAAA AGCATATAGT 1620 

CATCAGCTTA ATTACGCGGA ATGTTTCTTA ATGCAGGACC GTCGTGGAAC AATTCCATTT 1680 

TTTACTTGGA CACATAGAAG TGTAGACTTT TTTAATACAA TTGATGCTGA AAAGATTACT 1740

CAACTTCCAG TAGTGAAAGC ATATGCCTTG TCTTCAGGTG CTTCCATTAT TGAAGGTCCA 1800 

GGATTCACAG GAGGAAATTT ACTATTCCTA AAAGAATCTA GTAATTCAAT TGCTAAATTT 1860

AAAGTTACAT TAAATTCAGC AGCCTTGTTA CAACGATATC GTGTAAGAAT ACGCTATGCT 1920

TCTACCACTA ACTTACGACT TTTTGTGCAA AATTCAAACA ATGATTTTCT TGTCATCTAC 1980 

ATTAATAAAA CTATGAATAA AGATGATGAT TTAACATATC AAACATTTGA TCTCGCAACT 2040

ACTAATTCTA ATATGGGGTT CTCGGGTGAT AAGAATGAAC TTATAATAGG AGCAGAATCT 2100 

TTCGTTTCTA ATGAAAAAAT CTATATAGAT AAGATAGAAT TTATCCCAGT ACAATTGTAA 2160 

GGAGATTTTA AAATGTTGGG TGATGGTCAA AATGAAAGAA TAGGAAGGTG AATTTTGATG 2220 

GTTAGGAAAG ATTCTTTTAA CAAAAGCAAC ATGGAAAAGT ATACAGTACA AATATTAACC 2280



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
TAGGCCTCCA TCCATGGCAA ACCCTAACAA TC 
(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 
TCCCATCTTC CTACTTACGA CCCTGCAGAA ATACGGTCCA AC 
(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GACCTCACCT ACCAAACATT CGATCTTG 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
CGAGTTCTAC CGTAGGCAGC TCAAG 25 
(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1959 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

ATGAATCCAA ACAATCGAAG TGAACATGAT ACGATAAAGG TTACACCTAA CAGTGAATTG 60 

CAAACTAACC ATAATCAATA TCCTTTAGCT GACAATCCAA ATTCAACACT AGAAGAATTA 120 

AATTATAAAG AATTTTTAAG AATGACTGAA GACAGTTCTA CGGAAGTGCT AGACAACTCT 180 

ACAGTAAAAG ATGCAGTTGG GACAGGAATT TCTGTTGTAG GGCAGATTTT AGGTGTTGTA 240 

GGAGTTCCAT TTGCTGGGGC ACTCACTTCA TTTTATCAAT CATTTCTTAA CACTATATGG 300 

CCAAGTGATG CTGACCCATG GAAGGCTTTT ATGGCACAAG TTGAAGTACT GATAGATAAG 360 

AAAATAGAGG AGTATGCTAA AAGTAAAGCT CTTGCAGAGT TACAGGGTCT TCAAAATAAT 420 

TTCGAAGATT ATGTTAATGC GTTAAATTCC TGGAAGAAAA CACCTTTAAG TTTGCGAAGT 480 

AAAAGAAGCC AAGGTCGAAT AAGGGAACTT TTTTCTCAAG CAGAAAGTCA TTTTCGTAAT 540 

TCCATGCCGT CATTTGCAGT TTCCAAATTC GAAGTGCTGT TTCTACCAAC ATATGCACAA 600 

GCTGCAAATA CACATTTATT GCTATTAAAA GATGCTCAAG TTTTTGGAGA AGAATGGGGA 660 

TATTCTTCAG AAGATGTTGC TGAATTCTAT CGTAGACAAT TAAAACTTAC ACAACAATAC 720 

ACTGACCATT GTGTTAATTG GTATAATGTT GGATTAAATG GTTTAAGAGG TTCAACTTAT 780 

GATGCATGGG TCAAATTTAA CCGTTTTCGC AGAGAAATGA CTTTAACTGT ATTAGATCTA 840 

ATTGTACTTT TCCCATTTTA TGATATTCGG TTATACTCAA AAGGGGTTAA AACAGAACTA 900 

ACAAGAGACA TTTTTACGGA TCCAATTTTT TTACTTACTA CGCTTCAGAA GTACGGACCA 960 

ACTTTTTTGA GTATAGAAAA CTCTATTCGA AAACCTCATT TATTTGATTA TTTACAGGGG 1020 

ATTGAATTTC ATACGCGTCT TCAACCTGGT TACTTTGGGA AAGATTCTTT CAATTATTGG 1080 
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TCTGGTAATT ATGTAGAAAC TAGACCTAGT ATAGGATCTA GTAAGACAAT TACTTCCCCA 114 0 

TTTTATGGAG ATAAATCTAC TGAACCTGTA CAAAAGCTAA GCTTTGATGG ACAAAAAGTT 1200 

TATCGAACTA TAGCTAATAC AGACGTAGCG GCTTGGCCoA ATGGTAAGGT ATATTTAGGT 1260 

GTTACGAAAG TTGATTTTAG TCAATATGAT GATCAAAAAA ATGAAACTAG TACACAAACA 132 0 

TATGATTCAA AAAGAAACAA TGGCCATGTA AGTGCACAGG ATTCTATTGA CCAATTACCG 138 0 

CCAGAAACAA CAGATGAACC ACTTGAAAAA GCATATAGTC ATCAGCTTAA TTACGCGGAA 144 0 

TGTTTCTTAA TGCAGGACCG TCGTGGAACA ATTCCATTTT TTACTTGGAC ACATAGAAGT 1500 

GTAGACTTTT TTAATACAAT TGATGCTGAA AAGATTACTC AACTTCCAGT AGTGAAAGCA 156 0 

TATGCCTTGT CTTCAGGTGC TTCCATTATT GAAGGTCCAG GATTCACAGG AGGAAATTTA 1620 

CTATTCCTAA AAGAATCTAG TAATTCAATT GCTAAATTTA AAGTTACATT AAATTCAGCA 1680 

GCCTTGTTAC AACGATATCG TGTAAGAATA CGCTATGCTT CTACCACTAA CTTACGACTT 174 0 

TTTGTGCAAA ATTCAAACAA TGATTTTCTT GTCATCTACA TTAATAAAAC TATGAATAAA 18 00 

GATGATGATT TAACATATCA AACATTTGAT CTCGCAACTA CTAATTCTAA TATGGGGTTC 1860 

TCGGGTGATA AGAATGAACT TATAATAGGA GCAGAATCTT TCGTTTCTAA TGAAAAAATC 1920 

TATATAGATA AGATAGAATT TATCCCAGTA CAATTGTAA 1959 
(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr Arg Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Leu Leu Thr Thr Leu Gln Lys Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 649 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Ala Thr Glu
1 5 10 15

Asn Asn Glu Val Ser Asn Asn His Ala Gln Tyr Pro Leu Ala Asp Thr
20 25 30

Pro Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Arg Thr Thr
35 40 45

Asp Asn Asn Val Glu Ala Leu Asp Ser Ser Thr Thr Lys Asp Ala Ile
50 55 60

Gln Lys Gly Ile Ser Ile Ile Gly Asp Leu Leu Gly Val Val Gly Phe
65 70 75 80

Pro Tyr Gly Gly Ala Leu Val Ser Phe Tyr Thr Asn Leu Leu Asn Thr
85 90 95

Ile Trp Pro Gly Glu Asp Pro Leu Lys Ala Phe Met Gln Gln Val Glu
100 105 110

Ala Leu Ile Asp Gln Lys Ile Ala Asp Tyr Ala Lys Asp Lys Ala Thr
115 120 125

Ala Glu Leu Gln Gly Leu Lys Asn Val Phe Lys Asp Tyr Val Ser Ala
130 135 140

Leu Asp Ser Trp Asp Lys Thr Pro Leu Thr Leu Arg Asp Gly Arg Ser
145 150 155 160

Gln Gly Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser His Phe Arg
165 170 175

Arg Ser Met Pro Ser Phe Ala Val Ser Gly Tyr Glu Val Leu Phe Leu
180 185 190

Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu Leu Lys Asp
195 200 205

Ala Gln Ile Tyr Gly Thr Asp Trp Gly Tyr Ser Thr Asp Asp Leu Asn
210 215 220

Glu Phe His Thr Lys Gln Lys Asp Leu Thr Ile Glu Tyr Thr Asn His
225 230 235 240

Cys Ala Lys Trp Tyr Lys Ala Gly Leu Asp Lys Leu Arg Gly Ser Thr
245 250 255

Tyr Glu Glu Trp Val Lys Phe Asn Arg Tyr Arg Arg Glu Met Thr Leu
260 265 270

Thr Val Leu Asp Leu Ile Thr Leu Phe Pro Leu Tyr Asp Val Arg Thr
275 280 285

Tyr Thr Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Val Leu Thr Asp
290 295 300

Pro Ile Val Ala Val Asn Asn Met Asn Gly Tyr Gly Thr Thr Phe Ser
305 310 315 320

Asn Ile Glu Asn Tyr Ile Arg Lys Pro His Leu Phe Asp Tyr Leu His
325 330 335

Ala Ile Gln Phe His Ser Arg Leu Gln Pro Gly Tyr Phe Gly Thr Asp
340 345 350

Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Ser Thr Arg Ser Ser Ile
355 360 365

Gly Ser Asp Glu Ile Ile Arg Ser Pro Phe Tyr Gly Asn Lys Ser Thr
370 375 380

Leu Asp Val Gln Asn Leu Glu Phe Asn Gly Glu Lys Val Phe Arg Ala
385 390 395 400

Val Ala Asn Gly Asn Leu Ala Val Trp Pro Val Gly Thr Gly Gly Thr
405 410 415

Lys Ile His Ser Gly Val Thr Lys Val Gln Phe Ser Gln Tyr Asn Asp
420 425 430

Arg Lys Asp Glu Val Arg Thr Gln Thr Tyr Asp Ser Lys Arg Asn Val
435 440 445

Gly Gly Ile Val Phe Asp Ser Ile Asp Gln Leu Pro Pro Ile Thr Thr
450 455 460

Asp Glu Ser Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Val Arg
465 470 475 480

Cys Phe Leu Leu Gln Gly Gly Arg Gly Ile Ile Pro Val Phe Thr Trp
485 490 495

Thr His Lys Ser Val Asp Phe Tyr Asn Thr Leu Asp Ser Glu Lys Ile
500 505 510

Thr Gln Ile Pro Phe Val Lys Ala Phe Ile Leu Val Asn Ser Thr Ser
515 520 525

Val Val Ala Gly Pro Gly Phe Thr Gly Gly Asp Ile Ile Lys Cys Thr
530 535 540

Asn Gly Ser Gly Leu Thr Leu Tyr Val Thr Pro Ala Pro Asp Leu Thr
545 550 555 560

Tyr Ser Lys Thr Tyr Lys Ile Arg Ile Arg Tyr Ala Ser Thr Ser Gln
565 570 575

Val Arg Phe Gly Ile Asp Leu Gly Ser Tyr Thr His Ser Ile Ser Tyr
580 585 590

Phe Asp Lys Thr Met Asp Lys Gly Asn Thr Leu Thr Tyr Asn Ser Phe
595 600 605

Asn Leu Ser Ser Val Ser Arg Pro Ile Glu Ile Ser Gly Gly Asn Lys
610 615 620

Ile Gly Val Ser Val Gly Gly Ile Gly Ser Gly Asp Glu Val Tyr Ile
625 630 635 640

Asp Lys Ile Glu Phe Ile Pro Met Asp
645

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS :

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Pro Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asp Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Val Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Ser
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Ile Tyr Phe Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Gly Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Ile Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Ile Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Thr
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS :

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

Met Asn Pro Asn Asn Arg Ser Glu His Asp Thr Ile Lys Val Thr Pro
1 5 10 15

Asn Ser Glu Leu Gln Thr Asn His Asn Gln Tyr Pro Leu Ala Asp Asn
20 25 30

Pro Asn Ser Thr Leu Glu Glu Leu Asn Tyr Lys Glu Phe Leu Arg Met
35 40 45

Thr Glu Asp Ser Ser Thr Glu Val Leu Asp Asn Ser Thr Val Lys Asp
50 55 60

Ala Val Gly Thr Gly Ile Ser Val Val Gly Gln Ile Leu Gly Val Val
65 70 75 80

Gly Val Pro Phe Ala Gly Ala Leu Thr Ser Phe Tyr Gln Ser Phe Leu
85 90 95

Asn Thr Ile Trp Pro Ser Asp Ala Asp Pro Trp Lys Ala Phe Met Ala
100 105 110

Gln Val Glu Val Leu Ile Asp Lys Lys Ile Glu Glu Tyr Ala Lys Ser
115 120 125

Lys Ala Leu Ala Glu Leu Gln Gly Leu Gln Asn Asn Phe Glu Asp Tyr
130 135 140

Val Asn Ala Leu Asn Ser Trp Lys Lys Thr Pro Leu Ser Leu Arg Ser
145 150 155 160

Lys Arg Ser Gln Asp Arg Ile Arg Glu Leu Phe Ser Gln Ala Glu Ser
165 170 175

His Phe Arg Asn Ser Met Pro Ser Phe Ala Val Ser Lys Phe Glu Val
180 185 190

Leu Phe Leu Pro Thr Tyr Ala Gln Ala Ala Asn Thr His Leu Leu Leu
195 200 205

Leu Lys Asp Ala Gln Val Phe Gly Glu Glu Trp Gly Tyr Ser Ser Glu
210 215 220

Asp Val Ala Glu Phe Tyr His Arg Gln Leu Lys Leu Thr Gln Gln Tyr
225 230 235 240

Thr Asp His Cys Val Asn Trp Tyr Asn Val Gly Leu Asn Gly Leu Arg
245 250 255

Gly Ser Thr Tyr Asp Ala Trp Val Lys Phe Asn Arg Phe Arg Arg Glu
260 265 270

Met Thr Leu Thr Val Leu Asp Leu Ile Val Leu Phe Pro Phe Tyr Asp
275 280 285

Ile Arg Leu Tyr Ser Lys Gly Val Lys Thr Glu Leu Thr Arg Asp Ile
290 295 300

Phe Thr Asp Pro Ile Phe Ser Leu Asn Thr Leu Gln Glu Tyr Gly Pro
305 310 315 320

Thr Phe Leu Ser Ile Glu Asn Ser Ile Arg Lys Pro His Leu Phe Asp
325 330 335

Tyr Leu Gln Gly Ile Glu Phe His Thr Arg Leu Gln Pro Gly Tyr Phe
340 345 350

Gly Lys Asp Ser Phe Asn Tyr Trp Ser Gly Asn Tyr Val Glu Thr Arg
355 360 365

Pro Ser Ile Gly Ser Ser Lys Thr Ile Thr Ser Pro Phe Tyr Gly Asp
370 375 380

Lys Ser Thr Glu Pro Val Gln Lys Leu Ser Phe Asp Gly Gln Lys Val
385 390 395 400

Tyr Arg Thr Ile Ala Asn Thr Asp Val Ala Ala Trp Pro Asn Gly Lys
405 410 415

Val Tyr Leu Gly Val Thr Lys Val Asp Phe Ser Gln Tyr Asp Asp Gln
420 425 430

Lys Asn Glu Thr Ser Thr Gln Thr Tyr Asp Ser Lys Arg Asn Asn Gly
435 440 445

His Val Ser Ala Gln Asp Ser Ile Asp Gln Leu Pro Pro Glu Thr Thr
450 455 460

Asp Glu Pro Leu Glu Lys Ala Tyr Ser His Gln Leu Asn Tyr Ala Glu
465 470 475 480

Cys Phe Leu Met Gln Asp Arg Arg Gly Thr Ile Pro Phe Phe Thr Trp
485 490 495

Thr His Arg Ser Val Asp Phe Phe Asn Thr Ile Asp Ala Glu Lys Ile
500 505 510

Thr Gln Leu Pro Val Val Lys Ala Tyr Ala Leu Ser Ser Gly Ala Ser
515 520 525

Ile Ile Glu Gly Pro Gly Phe Thr Gly Gly Asn Leu Leu Phe Leu Lys
530 535 540

Glu Ser Ser Asn Ser Ile Ala Lys Phe Lys Val Thr Leu Asn Ser Ala
545 550 555 560

Ala Leu Leu Gln Arg Tyr Arg Val Arg Ile Arg Tyr Ala Ser Thr Thr
565 570 575

Asn Leu Arg Leu Phe Val Gln Asn Ser Asn Asn Asp Phe Leu Val Ile
580 585 590

Tyr Ile Asn Lys Thr Met Asn Lys Asp Asp Asp Leu Thr Tyr Gln Thr
595 600 605

Phe Asp Leu Ala Thr Thr Asn Ser Asn Met Gly Phe Ser Gly Asp Lys
610 615 620

Asn Glu Leu Ile Ile Gly Ala Glu Ser Phe Val Ser Asn Glu Lys Ile
625 630 635 640

Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Gln Leu
645 650

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 659 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Met Ile Arg Met Gly Gly Arg Lys Met Asn Pro Asn Asn Arg Ser Glu
1 5 10 15

Tyr Asp Thr Ile Lys Val Thr Pro Asn Ser Glu Leu Pro Thr Asn His
20 25 30

Asn Gln Tyr Pro Leu Ala Asp Asn Pro Asn Ser Thr Leu Glu Glu Leu
35 40 45

Asn Tyr Lys Glu Phe Leu Arg Met Thr Ala Asp Asn Ser Thr Glu Val
50 55 60

Leu Asp Ser Ser Thr Val Lys Asp Ala Val Gly Thr Gly Ile Ser Val
65 70 75 80

Val Gly Gln Ile Leu Gly Val Val Gly Val Pro Phe Ala Gly Ala Leu
85 90 95

Thr Ser Phe Tyr Gln Ser Phe Leu Asn Ala Ile Trp Pro Ser Asp Ala
100 105 110

Asp Pro Trp Lys Ala Phe Met Ala Gln Val Glu Val Leu Ile Asp Lys
115 120 125

Lys Ile Glu Glu Tyr Ala Lys Ser Lys Ala Leu Ala Glu Leu Gln Gly
130 135 140

Leu Gln Asn Asn Phe Glu Asp Tyr Val Asn Ala Leu Asp Ser Trp Lys
145 150 155 160

Lys Ala Pro Val Asn Leu Arg Ser Arg Arg Ser Gln Asp Arg Ile Arg
165 170 175

Glu Leu Phe Ser Gln Ala Glu Ser His Phe Arg Asn Ser Met Pro Ser
180 185 190

Phe Ala Val Ser Lys Phe Glu Val Leu Phe Leu Pro Thr Tyr Ala Gln
195 200 205

Ala Ala Asn Thr His Leu Leu Leu Leu Lys Asp Ala Gln Val Phe Gly
210 215 220

Glu Glu Trp Gly Tyr Ser Ser Glu Asp Ile Ala Glu Phe Tyr Gln Arg
225 230 235 240

Gln Leu Lys Leu Thr Gln Gln Tyr Thr Asp His Cys Val Asn Trp Tyr
245 250 255

Asn Val Gly Leu Asn Ser Leu Arg Gly Ser Thr Tyr Asp Ala Trp Val
260 265 270

Lys Phe Asn Arg Phe Arg Arg Glu Met Thr Leu Thr Val Leu Asp Leu
275 280 285

Ile Val Leu Phe Pro Phe Tyr Asp Val Arg Leu Tyr Ser Lys Gly Val
290 295 300

Lys Thr Glu Leu Thr Arg Asp Ile Phe Thr Asp Pro Ile Phe Thr Leu
305 310 315 320

Asn Ala Leu Gln Glu Tyr Gly Pro Thr Phe Ser Ser Ile Glu Asn Ser
325 330 335

Ile Arg Lys Pro His Leu Phe Asp Tyr Leu Arg Gly Ile Glu Phe His
340 345 350

Thr Arg Leu Arg Pro Gly Tyr Ser Gly Lys Asp Ser Phe Asn Tyr Trp
355 360 365

Ser Gly Asn Tyr Val Glu Thr Arg Pro Ser Ile Gly Ser Asn Asp Thr
370 375 380

Ile Thr Ser Pro Phe Tyr Gly Asp Lys Ser Ile Glu Pro Ile Gln Lys
385 390 395 400

Leu Ser Phe Asp Gly Gln Lys Val Tyr Arg Thr Ile Ala Asn Thr Asp
405 410 415

Ile Ala Ala Phe Pro Asp Gly Lys Ile Tyr Phe Gly Val Thr Lys Val
420 425 430

Asp Phe Ser Gln Tyr Asp Asp Gln Lys Asn Glu Thr Ser Thr Gln Thr
435 440 445

Tyr Asp Ser Lys Arg Tyr Asn Gly Tyr Leu Gly Ala Gln Asp Ser Ile
450 455 460

Asp Gln Leu Pro Pro Glu Thr Thr Asp Glu Pro Leu Glu Lys Ala Tyr
465 470 475 480

Ser His Gln Leu Asn Tyr Ala Glu Cys Phe Leu Met Gln Asp Arg Arg
485 490 495

Gly Thr Ile Pro Phe Phe Thr Trp Thr His Arg Ser Val Asp Phe Phe
500 505 510

Asn Thr Ile Asp Ala Glu Lys Ile Thr Gln Leu Pro Val Val Lys Ala
515 520 525

Tyr Ala Leu Ser Ser Gly Ala Ser Ile Ile Glu Gly Pro Gly Phe Thr
530 535 540

Gly Gly Asn Leu Leu Phe Leu Lys Glu Ser Ser Asn Ser Ile Ala Lys
545 550 555 560

Phe Lys Val Thr Leu Asn Ser Ala Ala Leu Leu Gln Arg Tyr Arg Val
565 570 575

Arg Ile Arg Tyr Ala Ser Thr Thr Asn Leu Arg Leu Phe Val Gln Asn
580 585 590

Ser Asn Asn Asp Phe Leu Val Ile Tyr Ile Asn Lys Thr Met Asn Ile
595 600 605

Asp Gly Asp Leu Thr Tyr Gln Thr Phe Asp Phe Ala Thr Ser Asn Ser
610 615 620

Asn Met Gly Phe Ser Gly Asp Thr Asn Asp Phe Ile Ile Gly Ala Glu
625 630 635 640

Ser Phe Val Ser Asn Glu Lys Ile Tyr Ile Asp Lys Ile Glu Phe Ile
645 650 655

Pro Val Gln



(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Met Ile Arg Lys Gly Gly Arg Lys Met Asn Pro Asn Asn Arg Ser Glu
1 5 10 15

His Asp Thr Ile Lys Thr Thr Glu Asn Asn Glu Val Pro Thr Asn His
20 25 30

Val Gln Tyr Pro Leu Ala Glu Thr Pro Asn Pro Thr Leu Glu Asp Leu
35 40 45

Asn Tyr Lys Glu Phe Leu Arg Met Thr Ala Asp Asn Asn Thr Glu Ala
50 55 60

Leu Asp Ser Ser Thr Thr Lys Asp Val Ile Gln Lys Gly Ile Ser Val
65 70 75 80

Val Gly Asp Leu Leu Gly Val Val Gly Phe Pro Phe Gly Gly Ala Leu
85 90 95

Val Ser Phe Tyr Thr Asn Phe Leu Asn Thr Ile Trp Pro Ser Glu Asp
100 105 110

Pro Trp Lys Ala Phe Met Glu Gln Val Glu Ala Leu Met Asp Gln Lys
115 120 125

Ile Ala Asp Tyr Ala Lys Asn Lys Ala Leu Ala Glu Leu Gln Gly Leu
130 135 140

Gln Asn Asn Val Glu Asp Tyr Val Ser Ala Leu Ser Ser Trp Gln Lys
145 150 155 160

Asn Pro Val Ser Ser Arg Asn Pro His Ser Gln Gly Arg Ile Arg Glu
165 170 175

Leu Phe Ser Gln Ala Glu Ser His Phe Arg Asn Ser Met Pro Ser Phe
180 185 190

Ala Ile Ser Gly Tyr Glu Val Leu Phe Leu Thr Thr Tyr Ala Gln Ala
195 200 205

Ala Asn Thr His Leu Phe Leu Leu Lys Asp Ala Gln Ile Tyr Gly Glu
210 215 220

Glu Trp Gly Tyr Glu Lys Glu Asp Ile Ala Glu Phe Tyr Lys Arg Gln
225 230 235 240

Leu Lys Leu Thr Gln Glu Tyr Thr Asp His Cys Val Lys Trp Tyr Asn
245 250 255

Val Gly Leu Asp Lys Leu Arg Gly Ser Ser Tyr Glu Ser Trp Val Asn
260 265 270

Phe Asn Arg Tyr Arg Arg Glu Met Thr Leu Thr Val Leu Asp Leu Ile
275 280 285

Ala Leu Phe Pro Leu Tyr Asp Val Arg Leu Tyr Pro Lys Glu Val Lys
290 295 300

Thr Glu Leu Thr Arg Asp Val Leu Thr Asp Pro Ile Val Gly Val Asn
305 310 315 320

Asn Leu Arg Gly Tyr Gly Thr Thr Phe Ser Asn Ile Glu Asn Tyr Ile
325 330 335

Arg Lys Pro His Leu Phe Asp Tyr Leu His Arg Ile Gln Phe His Thr
340 345 350

Arg Phe Gln Pro Gly Tyr Tyr Gly Asn Asp Ser Phe Asn Tyr Trp Ser
355 360 365

Gly Asn Tyr Val Ser Thr Arg Pro Ser Ile Gly Ser Asn Asp Ile Ile
370 375 380

Thr Ser Pro Phe Tyr Gly Asn Lys Ser Ser Glu Pro Val Gln Asn Leu
385 390 395 400

Glu Phe Asn Gly Glu Lys Val Tyr Arg Ala Val Ala Asn Thr Asn Leu
405 410 415

Ala Val Trp Pro Ser Ala Val Tyr Ser Gly Val Thr Lys Val Glu Phe
420 425 430

Ser Gln Tyr Asn Asp Gln Thr Asp Glu Ala Ser Thr Gln Thr Tyr Asp
435 440 445

Ser Lys Arg Asn Val Gly Ala Val Ser Trp Asp Ser Ile Asp Gln Leu
450 455 460

Pro Pro Glu Thr Thr Asp Glu Pro Leu Glu Lys Gly Tyr Ser His Gln
465 470 475 480

Leu Asn Tyr Val Met Cys Phe Leu Met Gln Gly Ser Arg Gly Thr Ile
485 490 495

Pro Val Leu Thr Trp Thr His Lys Ser Val Asp Phe Phe Asn Met Ile
500 505 510

Asp Ser Lys Lys Ile Thr Gln Leu Pro Leu Val Lys Ala Tyr Lys Leu
515 520 525

Gln Ser Gly Ala Ser Val Val Ala Gly Pro Arg Phe Thr Gly Gly Asp
530 535 540

Ile Ile Gln Cys Thr Glu Asn Gly Ser Ala Ala Thr Ile Tyr Val Thr
545 550 555 560

Pro Asp Val Ser Tyr Ser Gln Lys Tyr Arg Ala Arg Ile His Tyr Ala
565 570 575

Ser Thr Ser Gln Ile Thr Phe Thr Leu Ser Leu Asp Gly Ala Pro Phe
580 585 590

Asn Gln Tyr Tyr Phe Asp Lys Thr Ile Asn Lys Gly Asp Thr Leu Thr
595 600 605

Tyr Asn Ser Phe Asn Leu Ala Ser Phe Ser Thr Pro Phe Glu Leu Ser
610 615 620

Gly Asn Asn Leu Gln Ile Gly Val Thr Gly Leu Ser Ala Gly Asp Lys
625 630 635 640

Val Tyr Ile Asp Lys Ile Glu Phe Ile Pro Val Asn
645 650